Of late I have been obsessed with computer vision. This is partly due to my ambition of building my own butler, and partly due to the 3D scanner project. It led to a long and extensive study of the mathematics behind computer vision.
After some days of searching I discovered the git repository of OpenCV, a wonderful library full of interesting mathematical features. Since there was no simple pip install, as is the case with most non-trivial installations, I spent quite some time building and installing it.
Once it was installed I was at a complete loss, because nearly all the documentation I came across was for C/C++. I could not find any for Python (partly because I was not using Google; I use DuckDuckGo). After a while I did find some documentation, and it was quite fun.
After about half an hour of understanding the math I finally moved on to the code, and began by getting the webcam feed to show up.
import cv

def get_live_feed():
    # open the default webcam (device 0)
    cam = cv.CaptureFromCAM(0)
    cv.NamedWindow('live', 0)
    # calibrate the camera
    # required to adjust for lighting
    for i in range(10):
        img = cv.QueryFrame(cam)
    # capture and show the feed
    while True:
        img = cv.QueryFrame(cam)
        if img is not None:
            cv.ShowImage('live', img)
        c = cv.WaitKey(10)
        # Esc (key code 27) quits
        if c == 27:
            break
    cv.DestroyWindow('live')

if __name__ == '__main__':
    get_live_feed()

With this I had a live feed working. Now came the part where I had to detect my face in the frames obtained. With a few documentation snippets and code from here and there, I ended up with the following.
import sys

import cv2

# path to the Haar cascade XML file, given on the command line
casc = sys.argv[1]
faceCascade = cv2.CascadeClassifier(casc)

video_capture = cv2.VideoCapture(0)

while True:
    # grab one frame; ret is False if the read failed
    ret, frame = video_capture.read()
    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30),
        # OpenCV 2.x constant; OpenCV 3 and later call it cv2.CASCADE_SCALE_IMAGE
        flags=cv2.cv.CV_HAAR_SCALE_IMAGE
    )

    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)

    # Display the resulting frame
    cv2.imshow('Video', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
video_capture.release()
cv2.destroyAllWindows()
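To get a feel for what scaleFactor and minSize control in the detectMultiScale call, here is a rough sketch in plain Python. It is only an illustrative model, not OpenCV's actual implementation: the detector searches at a series of scales, each differing from the last by scaleFactor, and ignores anything smaller than minSize. The helper name window_sizes, the 24-pixel base window and the 640x480 frame are all assumptions made for the example.

```python
def window_sizes(frame_w, frame_h, base=24, scale_factor=1.1, min_size=30):
    """List the square window sides (in pixels) a multi-scale search would try.

    base is the side of the window the cascade was trained on (assumed 24 here);
    each step grows it by scale_factor until it no longer fits in the frame.
    """
    limit = min(frame_w, frame_h)
    sizes = []
    factor = 1.0
    while True:
        side = int(round(base * factor))
        if side > limit:
            break
        # windows below min_size are skipped, like minSize=(30, 30) above
        if side >= min_size:
            sizes.append(side)
        factor *= scale_factor
    return sizes


if __name__ == "__main__":
    # assumed 640x480 webcam frame
    print(window_sizes(640, 480))
    # a coarser scale step means fewer scales to search
    print(window_sizes(640, 480, scale_factor=1.3))
```

Under this model a smaller scaleFactor gives a finer search with more scales to check per frame, while a larger one is quicker but can step over a face that falls between two sizes.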
That led to the following video being created. The thing is a little off but works fine, generally speaking. The vision is still far from what the Butler must see; I will probably teach it to recognize other objects like keys. After face detection comes the task of face recognition, so expect a post on that soon.