Sunday, January 29, 2017

Improving Photo Search: A Step Across the Semantic Gap

Last month at Google I/O, we showed a major upgrade to the photos experience: you can now easily search your own photos without having to manually label each and every one of them. This is powered by computer vision and machine learning technology, which uses the visual content of an image to generate searchable tags for photos; combined with other sources like text tags and EXIF metadata, this enables search across thousands of concepts like flower, food, car, jet ski, or turtle.

For many years Google has offered Image Search over web images; however, searching across photos represents a difficult new challenge. In Image Search there are many pieces of information which can be used for ranking images, for example text from the web or the image filename. However, in the case of photos, there is typically little or no information beyond the pixels in the images themselves. This makes it harder for a computer to identify and categorize what is in a photo. There are some things a computer can do well, like recognize rigid objects and handwritten digits. For other classes of objects, this is a daunting task, because the average toddler is better at understanding what is in a photo than the world’s most powerful computers running state of the art algorithms.

This past October, the state of the art moved a bit closer to toddler performance. A system which used deep learning and convolutional neural networks easily beat out more traditional approaches in the ImageNet computer vision competition designed to test image understanding. The winning team was from Professor Geoffrey Hinton’s group at the University of Toronto.

We built and trained models similar to those from the winning team using software infrastructure for training large-scale neural networks developed at Google in a group started by Jeff Dean and Andrew Ng. When we evaluated these models, we were impressed; on our test set we saw double the average precision when compared to other approaches we had tried. We knew we had found what we needed to make photo searching easier for people using Google. We acquired the rights to the technology and went full speed ahead adapting it to run at large scale on Google’s computers. We took cutting edge research straight out of an academic research lab and launched it, in just a little over six months. You can try it out at photos.google.com.

Why the success now? What is new? Some things are unchanged: we still use convolutional neural networks -- originally developed in the late 1990s by Professor Yann LeCun in the context of software for reading handwritten letters and digits. What is different is that both computers and algorithms have improved significantly. First, bigger and faster computers have made it feasible to train larger neural networks with much larger data. Ten years ago, running neural networks of this complexity would have been a momentous task even on a single image -- now we are able to run them on billions of images. Second, new training techniques have made it possible to train the large deep neural networks necessary for successful image recognition.

We think the research community will find it interesting to hear about some of the unique aspects of the system we built, along with some qualitative observations we made while testing it.

The first is our label and training set and how it compares to the one used in the ImageNet Large Scale Visual Recognition competition. Since we were working on search across photos, we needed an appropriate label set. We came up with a set of about 2000 visual classes based on the most popular labels on Google+ Photos that also seemed to have a visual component a human could recognize. In contrast, the ImageNet competition has 1000 classes. As in ImageNet, the classes are not text strings but entities; in our case we use Freebase entities, which form the basis of the Knowledge Graph used in Google search. An entity is a way to uniquely identify something in a language-independent way. In English, when we encounter the word “jaguar”, it is hard to determine whether it refers to the animal or the car manufacturer. Entities assign a unique ID to each, removing that ambiguity: in this case “/m/0449p” for the former and “/m/012x34” for the latter. In order to train better classifiers we used more training images per class than ImageNet, 5000 versus 1000. Since we wanted to provide only high-precision labels, we also refined the classes from our initial set of 2000 to the most precise 1100 classes for launch.
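To make the entity idea concrete, here is a tiny sketch (my own illustration, not Google's data structures; only the two Freebase IDs quoted above are real) of mapping an ambiguous label to distinct, language-independent entity IDs:

#include <cstdio>
#include <map>
#include <string>
#include <vector>

int main() {
  // One ambiguous label maps to several entity IDs; the classifier's
  // classes are the IDs, not the word itself.
  std::map<std::string, std::vector<std::pair<std::string, std::string> > > entities;
  entities["jaguar"] = {{"/m/0449p", "jaguar (the animal)"},
                        {"/m/012x34", "Jaguar (the car manufacturer)"}};

  for (const auto &candidate : entities["jaguar"])
    std::printf("%s -> %s\n", candidate.second.c_str(), candidate.first.c_str());
  return 0;
}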

During our development process we had many more qualitative observations we felt are worth mentioning:

1) Generalization performance. Even though there was a significant difference in visual appearance between the training and test sets, the network appeared to generalize quite well. To train the system, we used images mined from the web, which did not match the typical appearance of personal photos. Images on the web are often used to illustrate a single concept and are carefully composed, so an image of a flower might be only a close-up of a single flower. Personal photos, by contrast, are unstaged and impromptu: a photo of a flower might contain many other things and may not be very carefully composed. So our training set image distribution was not necessarily a good match for the distribution of images we wanted to run the system on, as the examples below illustrate. However, we found that our system trained on web images was able to generalize and perform well on personal photos.

A typical photo of a flower found on the web.
A typical flower as it appears in an impromptu personal photo.

2) Handling of classes with multi-modal appearance. The network seemed to be able to handle classes with multimodal appearance quite well, for example the “car” class contains both exterior and interior views of the car. This was surprising because the final layer is effectively a linear classifier which creates a single dividing plane in a high dimensional space. Since it is a single plane, this type of classifier is often not very good at representing multiple very different concepts.
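As a rough illustration of why that is surprising, here is a minimal sketch (my own toy example, not Google's model; the feature values, weights, and classes are made up) of what the final layer effectively does: one dot product plus a bias per class against the same high-dimensional feature vector, i.e. a single dividing hyperplane per class.

#include <cstdio>
#include <vector>

// Hypothetical final-layer scoring: one weight vector and bias per class,
// applied to the feature vector produced by the earlier network layers.
struct LinearClass {
  const char *name;
  std::vector<float> w;  // one dividing hyperplane per class
  float b;
};

float Score(const LinearClass &c, const std::vector<float> &features) {
  float s = c.b;
  for (size_t i = 0; i < features.size(); ++i)
    s += c.w[i] * features[i];
  return s;  // higher means more likely to receive this label
}

int main() {
  // Toy 4-dimensional "features" standing in for the network's top-layer activations.
  std::vector<float> features = {0.9f, 0.1f, 0.4f, 0.7f};
  std::vector<LinearClass> classes = {
    {"car",    {0.8f, -0.2f, 0.1f, 0.5f}, -0.3f},
    {"flower", {-0.4f, 0.9f, 0.3f, -0.1f}, 0.0f},
  };
  for (const LinearClass &c : classes)
    std::printf("%s: %.3f\n", c.name, Score(c, features));
  return 0;
}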

3) Handling abstract and generic visual concepts. The system was able to do reasonably well on classes that one would think are somewhat abstract and generic. These include "dance", "kiss", and "meal", to name a few. This was interesting because for each of these classes it did not seem that there would be any simple visual clues in the image that would make it easy to recognize this class. It would be difficult to describe them in terms of simple basic visual features like color, texture, and shape.

Photos recognized as containing a meal.
4) Reasonable errors. Unlike other systems we experimented with, the errors which we observed often seemed quite reasonable to people. The mistakes were the type that a person might make - confusing things that look similar. Some people have already noticed this, for example, mistaking a goat for a dog or a millipede for a snake. This is in contrast to other systems which often make errors which seem nonsensical to people, like mistaking a tree for a dog.

Photo of a banana slug mistaken for a snake.
Photo of a donkey mistaken for a dog.

5) Handling very specific visual classes. Some of the classes we have are very specific, like specific types of flowers, for example “hibiscus” or “dahlia”. We were surprised that the system could do well on those. To recognize specific subclasses very fine detail is often needed to differentiate between the classes. So it was surprising that a system that could do well on a full image concept like “sunsets” could also do well on very specific classes.

Photo recognized as containing a hibiscus flower.
Photo recognized as containing a dahlia flower.
Photo recognized as containing a polar bear.
Photo recognized as containing a grizzly bear.

The resulting computer vision system worked well enough to launch to people as a useful tool to help improve personal photo search, which was a big step forward. So, is computer vision solved? Not by a long shot. Have we gotten computers to see the world as well as people do? The answer is not yet; there’s still a lot of work to do, but we’re closer.


Computer, respond to this email

Machine Intelligence for You

What I love about working at Google is the opportunity to harness cutting-edge machine intelligence for users’ benefit. Two recent Research Blog posts talked about how we’ve used machine learning in the form of deep neural networks to improve voice search and YouTube thumbnails. Today we can share something even wilder -- Smart Reply, a deep neural network that writes email.

I get a lot of email, and I often peek at it on the go with my phone. But replying to email on mobile is a real pain, even for short replies. What if there were a system that could automatically determine if an email was answerable with a short reply, and compose a few suitable responses that I could edit or send with just a tap?
Some months ago, Bálint Miklós from the Gmail team asked me if such a thing might be possible. I said it sounded too much like passing the Turing Test to get our hopes up... but having collaborated before on machine learning improvements to spam detection and email categorization, we thought we’d give it a try.

There’s a long history of research on both understanding and generating natural language for applications like machine translation. Last year, Google researchers Oriol Vinyals, Ilya Sutskever, and Quoc Le proposed fusing these two tasks in what they called sequence-to-sequence learning. This end-to-end approach has many possible applications, but one of the most unexpected that we’ve experimented with is conversational synthesis. Early results showed that we could use sequence-to-sequence learning to power a chatbot that was remarkably fun to play with, despite having included no explicit knowledge of language in the program.

Obviously, there’s a huge gap between a cute research chatbot and a system that I want helping me draft email. It was still an open question if we could build something that was actually useful to our users. But one engineer on our team, Anjuli Kannan, was willing to take on the challenge. Working closely with both Machine Intelligence researchers and Gmail engineers, she elaborated and experimented with the sequence-to-sequence research ideas. The result is the industrial strength neural network that runs at the core of the Smart Reply feature we’re launching this week.

How it works

A naive attempt to build a response generation system might depend on hand-crafted rules for common reply scenarios. But in practice, any engineer’s ability to invent “rules” would be quickly outstripped by the tremendous diversity with which real people communicate. A machine-learned system, by contrast, implicitly captures diverse situations, writing styles, and tones. These systems generalize better, and handle completely new inputs more gracefully than brittle, rule-based systems ever could.
Diagram by Chris Olah
Like other sequence-to-sequence models, the Smart Reply System is built on a pair of recurrent neural networks, one used to encode the incoming email and one to predict possible responses. The encoding network consumes the words of the incoming email one at a time, and produces a vector (a list of numbers). This vector, which Geoff Hinton calls a “thought vector,” captures the gist of what is being said without getting hung up on diction -- for example, the vector for "Are you free tomorrow?" should be similar to the vector for "Does tomorrow work for you?" The second network starts from this thought vector and synthesizes a grammatically correct reply one word at a time, like it’s typing it out. Amazingly, the detailed operation of each network is entirely learned, just by training the model to predict likely responses.
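To make the encode/decode flow concrete, here is a toy sketch in C++ (emphatically not the production system: a plain RNN rather than the LSTM described just below, a made-up eight-word vocabulary, and random untrained weights, so the "reply" it prints is gibberish). The point is only the data flow: the encoder folds the message into a single vector, and the decoder then emits one word at a time starting from that vector.

#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

typedef std::vector<float> Vec;

// Toy vocabulary and sizes, made up for this sketch.
static const char *kVocab[] = {"<eos>", "are", "you", "free", "tomorrow", "sure", "sounds", "good"};
static const int kVocabSize = 8;
static const int kDim = 16;

struct ToySeq2Seq {
  std::vector<Vec> embed, Wx, Wh, Wout;  // embeddings and weight matrices

  ToySeq2Seq() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> u(-0.1f, 0.1f);
    auto mat = [&](int rows, int cols) -> std::vector<Vec> {
      std::vector<Vec> m(rows, Vec(cols));
      for (auto &row : m)
        for (auto &v : row) v = u(rng);
      return m;
    };
    embed = mat(kVocabSize, kDim);  // one embedding per word
    Wx = mat(kDim, kDim);           // input -> hidden
    Wh = mat(kDim, kDim);           // hidden -> hidden
    Wout = mat(kVocabSize, kDim);   // hidden -> vocabulary scores
  }

  // One recurrent step: h' = tanh(Wx * x + Wh * h).
  Vec Step(const Vec &x, const Vec &h) const {
    Vec out(kDim);
    for (int i = 0; i < kDim; ++i) {
      float s = 0.0f;
      for (int j = 0; j < kDim; ++j) s += Wx[i][j] * x[j] + Wh[i][j] * h[j];
      out[i] = std::tanh(s);
    }
    return out;
  }

  // Encoder: fold the whole incoming message into one "thought vector".
  Vec Encode(const std::vector<int> &words) const {
    Vec h(kDim, 0.0f);
    for (int w : words) h = Step(embed[w], h);
    return h;
  }

  // Decoder: starting from the thought vector, greedily emit the highest-scoring
  // word at each step until <eos> (or a length cap).
  std::vector<int> Decode(Vec h, int max_len) const {
    std::vector<int> reply;
    int prev = 0;  // reuse <eos> as the start token in this toy
    for (int t = 0; t < max_len; ++t) {
      h = Step(embed[prev], h);
      int best = 0;
      float best_score = -1e9f;
      for (int w = 0; w < kVocabSize; ++w) {
        float s = 0.0f;
        for (int j = 0; j < kDim; ++j) s += Wout[w][j] * h[j];
        if (s > best_score) { best_score = s; best = w; }
      }
      if (best == 0) break;  // <eos> ends the reply
      reply.push_back(best);
      prev = best;
    }
    return reply;
  }
};

int main() {
  ToySeq2Seq model;  // untrained: the output is meaningless, only the structure matters
  std::vector<int> message = {1, 2, 3, 4};  // "are you free tomorrow"
  Vec thought = model.Encode(message);
  for (int w : model.Decode(thought, 5)) std::printf("%s ", kVocab[w]);
  std::printf("\n");
  return 0;
}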

One challenge of working with emails is that the inputs and outputs of the model can be hundreds of words long. This is where the particular choice of recurrent neural network type really matters. We used a variant of a "long short-term memory" network (or LSTM for short), which is particularly good at preserving long-term dependencies, and can home in on the part of the incoming email that is most useful in predicting a response, without being distracted by less relevant sentences before and after.

Of course, there’s another very important factor in working with email, which is privacy. In developing Smart Reply we adhered to the same rigorous user privacy standards we’ve always held -- in other words, no humans reading your email. This means researchers have to get machine learning to work on a data set that they themselves cannot read, which is a little like trying to solve a puzzle while blindfolded -- but a challenge makes it more interesting!

Getting it right

Our first prototype of the system had a few unexpected quirks. We wanted to generate a few candidate replies, but when we asked our neural network for the three most likely responses, it’d cough up triplets like “How about tomorrow?” “Wanna get together tomorrow?” “I suggest we meet tomorrow.” That’s not really much of a choice for users. The solution was provided by Sujith Ravi, whose team developed a great machine learning system for mapping natural language responses to semantic intents. This was instrumental in several phases of the project, and was critical to solving the "response diversity problem": by knowing how semantically similar two responses are, we can suggest responses that are different not only in wording, but in their underlying meaning.

Another bizarre feature of our early prototype was its propensity to respond with “I love you” to seemingly anything. As adorable as this sounds, it wasn’t really what we were hoping for. Some analysis revealed that the system was doing exactly what we’d trained it to do: generate likely responses -- and it turns out that responses like “Thanks", "Sounds good", and “I love you” are super common -- so the system would lean on them as a safe bet if it was unsure. Normalizing the likelihood of a candidate reply by some measure of that response's prior probability forced the model to predict responses that were not just highly likely, but also had high affinity to the original message. This made for a less lovey, but far more useful, email assistant.
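The post doesn't give the exact formula, but the idea reads like the following hedged sketch: score each candidate by its log-likelihood given the message minus a scaled log prior, so that globally common replies like "I love you" stop winning by default. The function name and all the numbers below are invented purely for illustration.

#include <cmath>
#include <cstdio>

// Score a candidate reply by how much more likely it is *given this message*
// than it is in general; alpha controls how strongly the generic prior is discounted.
// (Hypothetical function, not the production scoring code.)
float AffinityScore(float log_p_reply_given_msg, float log_p_reply_prior, float alpha = 1.0f) {
  return log_p_reply_given_msg - alpha * log_p_reply_prior;
}

int main() {
  // "I love you": very common overall, only moderately likely for this message.
  float generic = AffinityScore(std::log(0.020f), std::log(0.030f));
  // "Sounds good, see you at 3": rarer overall, but strongly tied to this message.
  float specific = AffinityScore(std::log(0.015f), std::log(0.001f));
  std::printf("generic: %.3f  specific: %.3f\n", generic, specific);
  return 0;
}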

Give it a try

We’re actually pretty amazed at how well this works. We’ll be rolling this feature out on Inbox for Android and iOS later this week, and we hope you’ll try it for yourself! Tap on a Smart Reply suggestion to start editing it. If it’s perfect as is, just tap send. Two-tap email on the go -- just like Bálint envisioned.



* This blog post may or may not have actually been written by a neural network.

More Literacy Activities

http://pbskids.org/games/literacy.html
 
When you finish one game, use the back button to return to the games page.

Saturday, January 28, 2017

Computer sound card drivers Downloads

Company               Drivers page
Acer                  Acer sound card drivers
Ad-Chips              Ad-Chips sound card drivers
Addonics              Addonics sound card drivers
Analog Devices        Analog sound card drivers
AOpen                 AOpen sound card drivers
ASUS                  ASUS sound card drivers
Audiotrack            Audiotrack sound card drivers
Avance Logic          Avance Logic sound card drivers
Aztech                Aztech sound card drivers
Aztech Labs           Aztech Labs sound card drivers
Boca Research         Boca Research sound card drivers
BTC                   BehaviorTech sound card drivers
C-Media (CMI)         C-Media sound card drivers
Creative Labs         Creative Labs sound card drivers
Crystal               Crystal sound card drivers
Digital Audio Labs    Digital Audio Labs sound card drivers
Digital Research      Digital Research sound card drivers
ESS Technologies      ESS sound card drivers
Frontier Design       Frontier sound card drivers
Genius-Kye            Genius-Kye sound card drivers
Guillemot             Guillemot sound card drivers
I/OMagic              I/OMagic sound card drivers
Logitech              Logitech sound card drivers
Mediatrix             Mediatrix sound card drivers
OPTi                  OPTi sound card drivers
PCChips               PCChips sound card drivers
Phoebe                Phoebe sound card drivers
PureDigital           PureDigital sound card drivers
Realtek               Realtek sound card drivers
Roland                Roland sound card drivers
SIIG                  SIIG sound card drivers
SoundBlaster          SoundBlaster sound card drivers
Turtle Beach          Turtle Beach sound card drivers
VIA                   VIA sound card drivers
VideoLogic            VideoLogic sound card drivers
Voyetra               Voyetra sound card drivers
Yamaha                Yamaha sound card drivers
Zoltrix               Zoltrix sound card drivers

Math Activities

http://pbskids.org/games/math.html

Autonomously Estimating Attractiveness using Computer Vision


How do you go about teaching a computer what is attractive and what is not?

This is a very difficult question I have been thinking about recently.
Do you create a duck face detector and subtract points? Is it a series of features we are looking for (specific hair color, eye color, skin smoothness, symmetry)?
Would this data come out in some sort of statistical analysis?
I decided to research this further using Eigen Faces and SVMs.

For those of you that don't know about eigenfaces [1], they are a decomposition of a set of face images into eigenvalues (weights) and eigenvectors (eigenfaces). The amazing thing about these is that, given a large enough training set, any image of a face can be closely reconstructed by multiplying a set of weights (eigenvalues) with the eigenvectors (eigenfaces).

Facial recognition then simply extracts the weights from a new image and computes the L2 norm (a distance metric) between them and the weights of each face in the training data set. The closest distance (below a certain threshold) indicates which face it is.
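Here is a minimal OpenCV (3.x API) sketch of that pipeline. It is my own illustration rather than the code behind the figures below, and it assumes the faces are already cropped, aligned, grayscale, and the same size: PCA yields the mean face and eigenfaces, projection yields each face's weight vector, and recognition is a nearest-neighbor search over those weights with the L2 norm.

#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <cstdio>
#include <string>
#include <vector>

int main() {
  // Hypothetical file list; every image must be the same size, grayscale, aligned.
  std::vector<std::string> files = {"face0.png", "face1.png", "face2.png"};
  cv::Mat data;  // one flattened face per row
  for (const std::string &f : files) {
    cv::Mat img = cv::imread(f, cv::IMREAD_GRAYSCALE);
    if (img.empty()) { std::printf("missing %s\n", f.c_str()); return 1; }
    cv::Mat row;
    img.convertTo(row, CV_32F);
    data.push_back(row.reshape(1, 1));  // flatten to a single row vector
  }

  // PCA: the mean face plus the top eigenvectors ("eigenfaces").
  int num_components = 2;
  cv::PCA pca(data, cv::Mat(), cv::PCA::DATA_AS_ROW, num_components);

  // Each face's weight vector is its projection onto the eigenfaces.
  cv::Mat weights = pca.project(data);

  // Recognition: project a probe face and find the closest training weight vector (L2 norm).
  cv::Mat probe = pca.project(data.row(0));  // reusing a training face as the probe here
  int best = 0;
  double best_dist = 1e30;
  for (int i = 0; i < weights.rows; ++i) {
    double d = cv::norm(probe, weights.row(i), cv::NORM_L2);
    if (d < best_dist) { best_dist = d; best = i; }
  }
  std::printf("closest match: %s (distance %.3f)\n", files[best].c_str(), best_dist);
  return 0;
}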

Top 20 Eigen Faces from our data set



Average image from our data set 


The attractiveness of the average of all images is greater than the average of the attractiveness of all images.


We are currently using a similar model for detecting attractiveness, and right now we are accurate ~64% of the time; however, that is just using the L2 norm. The hope is that by using an SVM classifier on the weights for all the faces, we will be able to determine the important eigenfaces and their weights for attractiveness and then create a classification system.
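For the SVM step, a sketch along these lines is what I have in mind (written against OpenCV's modern cv::ml API rather than whatever we actually ran, and with random placeholder data standing in for the real PCA weight vectors and 1-5 ratings):

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>
#include <cstdio>

int main() {
  // Assumed inputs: one PCA weight vector per row, one 1-5 rating per face.
  cv::Mat weights(6, 3, CV_32F);  // placeholder data for illustration only
  cv::randu(weights, cv::Scalar(-1.0), cv::Scalar(1.0));
  cv::Mat labels = (cv::Mat_<int>(6, 1) << 1, 2, 3, 3, 4, 5);

  // A linear SVM on the eigenface weights; the learned coefficients hint at
  // which eigenfaces matter most for the attractiveness decision.
  cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
  svm->setType(cv::ml::SVM::C_SVC);
  svm->setKernel(cv::ml::SVM::LINEAR);
  svm->train(weights, cv::ml::ROW_SAMPLE, labels);

  float predicted = svm->predict(weights.row(0));
  std::printf("predicted rating for face 0: %.0f\n", predicted);
  return 0;
}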

If you review the heat-mapped eigenfaces in the first figure, you will notice specific expressions and features that are highlighted. We have 2027 of these eigenfaces and each face can be seen as a weighted combination of these. Our hope is to find the most attractive and unattractive features in the eigenfaces.

I hypothesize that somewhere in there is an eigenface that corresponds to duckface. Think of the uses for such a thing: anytime someone uploads a duckface photo to Facebook, it could warn them that they should stop doing that.

Below are our current statistics of the mean and standard deviation of the weight vectors for each level of attractiveness (1 - 5 with 5 being the most attractive). We have also included a close view of the first 50 weights and a histogram of the weights.








Histogram




It seems there are some very interesting differences in the mean, standard deviation, and histogram across the levels of attractiveness. However, this may be because we have a different number of training examples for each level of attractiveness (the ratings are roughly Gaussian).

It will be interesting to see what more tinkering yields. We can only hope to find the elusive eigenface corresponding to duck face.

References:
[1] Turk, Matthew A. and Pentland, Alex P. "Face recognition using eigenfaces." Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 1991.

Consider donating to further my tinkering.



Places you can find me


Friday, January 27, 2017

Announcing the Google MOOC Focused Research Awards



Last year, Google and Tsinghua University hosted the 2014 APAC MOOC Focused Faculty Workshop, an event designed to share, brainstorm and generate ideas aimed at fostering MOOC innovation. As a result of the ideas generated at the workshop, we solicited proposals from the attendees for research collaborations that would advance important topics in MOOC development.

After expert reviews and committee discussions, we are pleased to announce the following recipients of the MOOC Focused Research Awards. These awards cover research exploring new interactions to enhance learning experience, personalized learning, online community building, interoperability of online learning platforms and education accessibility:

  • “MOOC Visual Analytics” - Michael Ginda, Indiana University, United States
  • “Improvement of students’ interaction in MOOCs using participative networks” - Pedro A. Pernías Peco, Universidad de Alicante, Spain
  • “Automated Analysis of MOOC Discussion Content to Support Personalised Learning” - Katrina Falkner, The University of Adelaide, Australia
  • “Extending the Offline Capability of Spoken Tutorial Methodology” - Kannan Moudgalya, Indian Institute of Technology Bombay, India
  • “Launching the Pan Pacific ISTP (Information Science and Technology Program) through MOOCs” - Yasushi Kodama, Hosei University, Japan
  • “Fostering Engagement and Social Learning with Incentive Schemes and Gamification Elements in MOOCs” - Thomas Schildhauer, Alexander von Humboldt Institute for Internet and Society, Germany
  • “Reusability Measurement and Social Community Analysis from MOOC Content Users” - Timothy K. Shih, National Central University, Taiwan

In order to further support these projects and foster collaboration, we have begun pairing the award recipients with Googlers pursuing online education research as well as product development teams.

Google is committed to supporting innovation in online learning at scale, and we congratulate the recipients of the MOOC Focused Research Awards. It is our belief that these collaborations will further develop the potential of online education, and we are very pleased to work with these researchers to jointly push the frontier of MOOCs.

Wednesday, January 25, 2017

You can change your Windows password if you forget the current password

1st step: Go to Run, type lusrmgr.msc, and press Enter.
2nd step: In the window that opens, click the Users folder.
3rd step: There you can see your Windows username.
4th step: Follow the instructions in the image to set a new password for that username.

The world's largest photo service just made its pictures free to use


Getty Images is the world's largest image database, with millions of images, all watermarked. These represent over a hundred years of photography, from FDR on the campaign trail to last week's Oscars, all stamped with a transparent placard reminding you that you don't own the rights. Until now, if you wanted Getty to take off the watermark, you had to pay for it. Getty Images, in a rare act of digital common sense, has realised that so many of its images are already online in the public space, accessible via a Google image search. So, provided you register, you can simply embed one of their images in your web page (as you would a YouTube clip) and legally use it, along with a label that indicates its source. It's very refreshing to see a company be so pragmatic about digital rights. Rather than employing teams of people to issue takedown notices and legal threats, they've made it easy for everyone to use their wonderful images. So here's a lovely photo of the beautiful Auckland waterfront at night, courtesy of Getty Images.



from The Universal Machine http://universal-machine.blogspot.com/



Monday, January 23, 2017

Should My Kid Learn to Code?



(Cross-posted on the Google for Education Blog)

Over the last few years, successful marketing campaigns such as Hour of Code and Made with Code have helped K12 students become increasingly aware of the power and relevance of computer programming across all fields. In addition, there has been growth in developer bootcamps, online “learn to code” programs (code.org, CS First, Khan Academy, Codecademy, Blockly Games, etc.), and non-profits focused specifically on girls and underrepresented minorities (URMs) (Technovation, Girls who Code, Black Girls Code, #YesWeCode, etc.).

This is good news, as we need many more computing professionals than are currently graduating from Computer Science (CS) and Information Technology (IT) programs. There is evidence that students are starting to respond positively too, given undergraduate departments are experiencing capacity issues in accommodating all the students who want to study CS.

Most educators agree that basic application and internet skills (typing, word processing, spreadsheets, web literacy and safety, etc.) are fundamental, and thus, “digital literacy” is a part of K12 curriculum. But is coding now a fundamental literacy, like reading or writing, that all K12 students need to learn as well?

In order to gain a deeper understanding of the devices and applications they use every day, it’s important for all students to try coding, which also has the positive effect of inspiring more potential future programmers. Furthermore, there is a set of relevant skills, often consolidated as “computational thinking”, that is becoming more important for all students, given the growth in the use of computers, algorithms and data in many fields. These include:
  • Abstraction, which is the replacement of a complex real-world situation with a simple model within which we can solve problems. CS is the science of abstraction: creating the right model for a problem, representing it in a computer, and then devising appropriate automated techniques to solve the problem within the model. A spreadsheet is an abstraction of an accountant’s worksheet; a word processor is an abstraction of a typewriter; a game like Civilization is an abstraction of history.
  • An algorithm is a procedure for solving a problem in a finite number of steps that can involve repetition of operations, or branching to one set of operations or another based on a condition; a short example appears just after this list. Being able to represent a problem-solving process as an algorithm is becoming increasingly important in any field that uses computing as a primary tool (business, economics, statistics, medicine, engineering, etc.). Success in these fields requires algorithm design skills.
  • As computers become essential in a particular field, more domain-specific data is collected, analyzed and used to make decisions. Students need to understand how to find the data; how to collect it appropriately and with respect to privacy considerations; how much data is needed for a particular problem; how to remove noise from data; what techniques are most appropriate for analysis; how to use an analysis to make a decision; etc. Such data skills are already required in many fields.
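As a tiny concrete illustration of those ingredients of an algorithm (a finite number of steps, repetition, and branching on a condition), here is a sketch of the classic "find the largest value" procedure; the scores are made up:

#include <cstdio>
#include <vector>

int main() {
  // Made-up quiz scores; the algorithm works for any list of numbers.
  std::vector<int> scores = {72, 95, 88, 61, 95, 79};

  int largest = scores[0];
  for (size_t i = 1; i < scores.size(); ++i) {  // repetition over each value
    if (scores[i] > largest) {                  // branching on a condition
      largest = scores[i];
    }
  }
  std::printf("largest score: %d\n", largest);
  return 0;
}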
These computational thinking skills are becoming more important as computers, algorithms and data become ubiquitous. Coding will also become more common, particularly with the growth in the use of visual programming languages, like Blockly, that remove the need to learn programming language syntax, and via custom blocks, can be used as an abstraction for many different applications.

One way to represent these different skill sets and the students who need them is as follows:
All students need digital literacy, many need computational thinking depending on their career choice, and some will actually do the software development in high-tech companies, IT departments, or other specialized areas. I don’t believe all kids should learn to code seriously, but all kids should try it via programs like code.org, CS First or Khan Academy. This gives students a good introduction to computational thinking and coding, and provides them with a basis for making an informed decision on whether CS or IT is something they wish to pursue as a career.

Sunday, January 22, 2017

How to determine how many cameras are connected to a computer and connect to and use ptz cameras


I figure this is a nice, easy first post. This is some code I struggled to find a couple of years ago for automatically determining how many cameras are connected to a computer via DirectShow (this code is also useful in a PTZ application, as I will show right after). I use OpenCV to capture from the camera and the DirectShow library to use the PTZ camera functions. I also used the vector and stringstream libraries for my own ease (#include <vector> #include <sstream>).

The DisplayError function is a generic wrapper for you to fill in, whether you use printf or a messagebox.

// Counts the video capture devices that DirectShow can enumerate.
// Assumes COM has already been initialized (CoInitialize/CoInitializeEx) and that
// Exception and DisplayError are user-defined helpers.
int getDeviceCount() {
  try {
    ICreateDevEnum *pDevEnum = NULL;
    IEnumMoniker *pEnum = NULL;
    int deviceCounter = 0;
    HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, NULL, CLSCTX_INPROC_SERVER, IID_ICreateDevEnum, reinterpret_cast<void**>(&pDevEnum));
    if (SUCCEEDED(hr)) {
      // Create an enumerator for the video capture category.
      hr = pDevEnum->CreateClassEnumerator(CLSID_VideoInputDeviceCategory, &pEnum, 0);
      if (hr == S_OK) {
        IMoniker *pMoniker = NULL;
        while (pEnum->Next(1, &pMoniker, NULL) == S_OK) {
          IPropertyBag *pPropBag;
          hr = pMoniker->BindToStorage(0, 0, IID_IPropertyBag, (void**)(&pPropBag));
          if (FAILED(hr)) {
            pMoniker->Release();
            continue; // Skip this one, maybe the next one will work.
          }
          pPropBag->Release();
          pPropBag = NULL;
          pMoniker->Release();
          pMoniker = NULL;
          deviceCounter++;
        }
        pEnum->Release();
        pEnum = NULL;
      }
      pDevEnum->Release();
      pDevEnum = NULL;
    }
    return deviceCounter;
  } catch(Exception & e) {
    DisplayError(e.ToString());
  } catch(...) {
    DisplayError("Error Caught Counting # of Devices");
  }
  return 0;
}


This can easily be modified to connect to any number of PTZ cameras with a quick ptz class.

Here is our ptz class:

class ptz {
public:
  struct controlVals {
    long min, max, step, def, flags;
  };
  IBaseFilter *filter;
  IAMCameraControl *camControl;

  bool valid, validMove;
  CvCapture *capture;
  int threadNum;

  // Constructor: open a continuous OpenCV capture for the given device index.
  ptz(int instance) {
    threadNum = instance;
    filter = NULL;
    camControl = NULL;
    validMove = false;
    // sets up a continuous capture point through the msvc driver
    capture = cvCaptureFromCAM(threadNum);
    valid = (capture != NULL);
  }

  // Query the DirectShow camera-control interface from the bound filter.
  bool Initialize() {
    camControl = NULL;
    HRESULT hr = filter->QueryInterface(IID_IAMCameraControl, (void **)&camControl);
    return hr == S_OK;
  }

  void Destroy() {
    if (camControl) {
      camControl->Release();
      camControl = NULL;
    }
    if (filter) {
      filter->Release();
      filter = NULL;
    }
    if (capture) {
      cvReleaseCapture(&capture);
      capture = NULL;
    }
  }
};


And here is the modified getDeviceCount code, which now connects to all the cameras and gets the PTZ information using DirectShow:

int getDeviceCount(vector<ptz> &cameras) {
  try {
    ICreateDevEnum *pDevEnum = NULL;
    IEnumMoniker *pEnum = NULL;
    int deviceCounter = 0;

    HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, NULL, CLSCTX_INPROC_SERVER, IID_ICreateDevEnum, reinterpret_cast<void**>(&pDevEnum));
    if (SUCCEEDED(hr)) {
      // Create an enumerator for the video capture category.
      hr = pDevEnum->CreateClassEnumerator(CLSID_VideoInputDeviceCategory, &pEnum, 0);
      if (hr == S_OK) {
        IMoniker *pMoniker = NULL;
        while (pEnum->Next(1, &pMoniker, NULL) == S_OK) {
          IPropertyBag *pPropBag;
          hr = pMoniker->BindToStorage(0, 0, IID_IPropertyBag, (void**)(&pPropBag));
          if (FAILED(hr)) {
            pMoniker->Release();
            continue; // Skip this one, maybe the next one will work.
          }
          // Open an OpenCV capture for this index and bind the DirectShow filter
          // so we can later query IAMCameraControl for PTZ.
          ptz tmp = ptz(deviceCounter);
          HRESULT hr2 = pMoniker->BindToObject(NULL, NULL, IID_IBaseFilter, (void**)&(tmp.filter));
          if (SUCCEEDED(hr2) && tmp.valid)
            tmp.validMove = tmp.Initialize();
          cameras.push_back(tmp);
          pPropBag->Release();
          pPropBag = NULL;
          pMoniker->Release();
          pMoniker = NULL;
          deviceCounter++;
        }
        pEnum->Release();
        pEnum = NULL;
      }
      pDevEnum->Release();
      pDevEnum = NULL;
    }
    return deviceCounter;
  } catch(Exception & e) {
    DisplayError(e.ToString());
  } catch(...) {
    DisplayError("Error Caught Counting # of Devices");
  }
  return 0;
}


Notice how we have integrated OpenCV into our ptz class. As we find cameras, we can capture from them using OpenCV and then use DirectShow to grab the information used for PTZ. To move the camera we can use a pan and a tilt function like the following:

HRESULT MechanicalPan(IAMCameraControl *pCameraControl, long value) {
  HRESULT hr = 0;
  try {
    long flags = KSPROPERTY_CAMERACONTROL_FLAGS_RELATIVE | KSPROPERTY_CAMERACONTROL_FLAGS_MANUAL;
    hr = pCameraControl->Set(CameraControl_Pan, value, flags);
    if (hr == 0x800700AA)
      Sleep(1);
    else if (hr != S_OK && hr != 0x80070490) {
      stringstream tmp;
      tmp << "ERROR: Unable to set CameraControl_Pan property value to " << value << ". (Error " << std::hex << hr << ")";
      throw Exception(tmp.str().c_str());
    }
    // Note that we need to wait until the movement is complete, otherwise the next request will
    // fail with hr == 0x800700AA == HRESULT_FROM_WIN32(ERROR_BUSY).
  } catch(Exception & e) {
    DisplayError(e.ToString());
  } catch(...) {
    DisplayError("Error Caught panning camera");
  }
  return hr;
}
// ----------------------------------------------------------------------------
HRESULT MechanicalTilt(IAMCameraControl *pCameraControl, long value) {
  HRESULT hr = 0;
  try {
    long flags = KSPROPERTY_CAMERACONTROL_FLAGS_RELATIVE | KSPROPERTY_CAMERACONTROL_FLAGS_MANUAL;
    hr = pCameraControl->Set(CameraControl_Tilt, value, flags);
    if (hr == 0x800700AA)
      Sleep(1);
    else if (hr != S_OK && hr != 0x80070490) {
      stringstream tmp;
      tmp << "ERROR: Unable to set CameraControl_Tilt property value to " << value << ". (Error " << std::hex << hr << ")";
      throw Exception(tmp.str().c_str());
    }
    // Note that we need to wait until the movement is complete, otherwise the next request will
    // fail with hr == 0x800700AA == HRESULT_FROM_WIN32(ERROR_BUSY).
  } catch(Exception & e) {
    DisplayError(e.ToString());
  } catch(...) {
    DisplayError("Error Caught tilting camera");
  }
  return hr;
}
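To tie the pieces together, here is a minimal usage sketch (my own addition, assuming the getDeviceCount, ptz, MechanicalPan and MechanicalTilt definitions above are in scope, and remembering that DirectShow needs COM initialized first):

int main() {
  CoInitializeEx(NULL, COINIT_MULTITHREADED);  // DirectShow requires COM to be initialized
  vector<ptz> cameras;
  int count = getDeviceCount(cameras);
  printf("Found %d capture device(s)\n", count);
  for (size_t i = 0; i < cameras.size(); i++) {
    if (cameras[i].validMove) {
      // Nudge the camera one relative step right, then one step up.
      MechanicalPan(cameras[i].camControl, 1);
      MechanicalTilt(cameras[i].camControl, 1);
    }
    cameras[i].Destroy();
  }
  CoUninitialize();
  return 0;
}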




Consider donating to further my tinkering.


Places you can find me

How to Implement a Buffer Overflow


Buffer overflow exploits are a commonly found class of problem that can cause irrevocable damage to a system if taken advantage of. The only way to prevent them is to be careful about coding practices and to bounds-check so that no kind of input, stream, file, command, encryption key, or otherwise can be used to overwrite a buffer past its bounds. The problem with this is that many libraries, programs, and operating systems used by programmers already contain such exploits, making prevention difficult if not impossible.

That being said, here is kind of how it works (all examples run in Windows XP using gdb):
The files used for the exploit are named vulnerable_code (courtesy of Dr. Richard Brooks from Clemson University), and they can be found here:
http://code.google.com/p/stevenhickson-code/source/browse/#svn%2Ftrunk%2FBufferOverflow

(All code is licensed under the GPL modified license included at the google-code address. It is simply the GPL v3.0 with the modifier that if you enjoyed this and run into me somewhere sometime, you are welcome to buy me a drink).

The link above also includes all the assembly files used to create shellcode, nasm to assemble it, and arwin to find the memory locations. It should have everything you need.

Note: Bear in mind that the memory locations will probably be different for you and you will have to find them yourself (probably by writing AAAA over and over again in memory).

IMPORTANT:
This tutorial is used for explanation and education only. Do not copy my examples and turn them in for a class. You will get caught and get in trouble and you won't learn anything and I will program a helicopter to hunt you down autonomously as revenge.
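To make the idea concrete before you dig into those files, here is the classic shape of the bug in my own minimal example (not the vulnerable_code from the link above): a fixed-size stack buffer filled by strcpy with no bounds check, so an argument longer than the buffer overwrites adjacent stack memory, including, eventually, the saved return address.

#include <cstring>
#include <cstdio>

// Deliberately unsafe: buf is 16 bytes, but strcpy copies until the NUL
// terminator, however long the input is. A long argv[1] smashes the stack.
void vulnerable(const char *input) {
  char buf[16];
  strcpy(buf, input);          // no bounds check -- this is the whole bug
  printf("you said: %s\n", buf);
}

int main(int argc, char **argv) {
  if (argc > 1)
    vulnerable(argv[1]);       // try an argument much longer than 16 bytes
  return 0;
}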


 
