Speech Recognition Software

frogamus

I broke out an old copy of Dragon NaturallySpeaking yesterday and installed it on my XP system. The copy is about 10 years old. I stopped using it shortly after I bought it because I never had any real use for it; programs like this aren't really much use for writing code. But since I have been trying to develop a website, I thought I would pick it up and give it a whirl. It seems to work fairly well, but it needs a lot of training.

I was wondering what kind of success others have had and if anyone had any insights on newer speech recognition programs.

Thanks.
 

merrillmck

What is Dragon Naturally Speaking? My guess is you train it to recognize your voice and eventually you can "write" a paper using only your voice ... at least that's the goal.

So how does that tie into your website? Do you want your website to have that type of functionality?

And sorry, I don't have any advice/experience to offer. I've written computer vision programs (computers attempt to recognize what is in images) so I'm interested in this topic as it is very related.
 

frogamus

What is Dragon Naturally Speaking? My guess is you train it to recognize your voice and eventually you can "write" a paper using only your voice ... at least that's the goal.

Yes. You talk, and it types whatever you say. When setting up a new user, you read a short article aloud so it can learn how you speak. After setup, when it makes a mistake, you correct it. The more you use it, the better it gets. You use different voice commands to perform different tasks: you can navigate between programs, move through menus, or click buttons.

So how does that tie into your website? Do you want your website to have that type of functionality?

I was considering using it to make writing easier. It doesn't make spelling errors, and I find it hard to keep my train of thought going while typing. I never used it before because I mostly did programming as a hobby, and it's not very good for writing code.
 

zen-r

Lol. I think DragonNS was the last time I dabbled with voice recognition software as well. Yes, it must have been about 10 years ago.

Perhaps it scarred both of us so badly, that it put us both off for life! ;)

I remember having high hopes for it when I first tried it out, only to be disappointed to find that there were so many errors which needed editing, that it was quicker to type it in the first place. I also remember that as a Brit, I found that I got best results by putting on an American accent!

I can't remember if it was only for dictating text, though, or if it could also be used for controlling other software.

The only voice recognition I use at the moment is on 2 non-PC devices.

In my mobile phone/PDA/satnav/camera (all-in-one): I installed MS Voice Command & it has proven a great enhancement, because it integrates nicely with most programs on the device. For example, I can now tell it to play a specific music track, or whole album (without having to search for it myself) or get it to read me all songs by a certain artist. I can tell it to open any program, such as the web browser. I've assigned the Voice program to 1 hot key, so a quick press is all it takes before I can start talking to my device!

And I have set it so that when someone calls me, it automatically checks my address book & if they are in there, it tells me who is calling instead of just playing a ring tone. It doesn't read everything perfectly, but well enough for me to understand who it is.

In my TomTom GPS/satnav: the pre-installed voice software in there is great because it means I can tell the device about where I want to go, even while I'm driving (thus keeping it legal). And, of course, it can read out directions to me & even tell me which streets to look out for as I'm driving along. This is far better than my last device which just said, for example, to "take the next left" - regularly leaving me unsure about which "left" it meant, & having to glance dangerously at the screen.


So voice recognition still has lots of very worthwhile uses - though personally I have yet to need any on my PC!


 

frogamus

I remember having high hopes for it when I first tried it out, only to be disappointed to find that there were so many errors which needed editing, that it was quicker to type it in the first place. I also remember that as a Brit, I found that I got best results by putting on an American accent!

I sometimes wonder if I may have some of the same problem. I’ve been told I’m a little on the Texan side myself.


I can't remember if it was only for dictating text, though, or if it could also be used for controlling other software.

I know you can launch programs and I have been able to use it in all the programs I tried. Recent versions are available and I’m sure they must have made some progress in the past 10 years.

Edit:
A little more information on the subject.

I have Windows Vista Home Premium on my desktop and, as far as I know, speech recognition is part of Vista. It refers to itself as Windows Speech Recognition in the tutorial. I worked partly through the tutorial, but I really haven’t tried it outside of that.

It is listed as Speech Recognition in the Control Panel. I couldn’t get it to start up without a microphone hooked up. Maybe someone knows if it is worth the time and effort.

By the way zen-r, it has recognition engines for both US and UK English.
 

sangeetjass

Do you think voice recognition software can take over the medical transcription industry one day? I think if this happens, it will be the first win of Artificial Intelligence over human beings, taking our jobs away.
 

Livewire

Do you think voice recognition software can take over the medical transcription industry one day? I think if this happens, it will be the first win of Artificial Intelligence over human beings, taking our jobs away.

I have my doubts; there's too many variations/accents on something like English to code in, let alone all the varying dialects of Spanish, French, German, Japanese, etc.

While it might get popular in one area, it won't experience success in another (due to the lack of being able to recognize a particular accent without -some- additional coding - I know it can "learn," but from my own experience it frequently seems to un-learn what it did learn because of accent/dialect issues), until it gets some additional programming behind it.


Guess what I'm saying is it's still way easier to teach someone to type 120WPM and have them transcribe it instead of a computer program, especially if the doctor/scientist/researcher has to keep going back to correct the typos the program is making.
 

Sharky

With regards to medical transcription -- I've heard the voice recordings: it's hard enough understanding what they're saying for a person, a machine simply wouldn't cope.

However -- I've had fairly positive experiences with Microsoft's own voice recognition software in Vista. Leaps and bounds ahead of XP's one. Only problem is that (with my set up at least -- desktop microphone instead of headset one) it still can't cope with a washing machine in the background, for example. With the noise reduction software that came with my sound card, it can almost cope with music, but still not good enough for proper production use. Nor is it exceptionally good at figuring out which is my voice, and when I'm trying to dictate, vs when I'm trying to have a conversation that doesn't need transcribing.

And by far the biggest issue, you feel like a bit of a numpty talking to your PC. Especially when it doesn't reply. Almost like talking to a brick wall.
 

frogamus

However -- I've had fairly positive experiences with Microsoft's own voice recognition software in Vista. Leaps and bounds ahead of XP's one.

The version of XP I have (Home Edition) didn’t come with any speech recognition software, only Text To Speech.

And by far the biggest issue, you feel like a bit of a numpty talking to your PC. Especially when it doesn't reply. Almost like talking to a brick wall.

That will take some getting used to. The program I’m using right now (Dragon NaturallySpeaking) will work with either live speech or recordings, so there is the option of using a dictation machine or even an MP3 player for input. I may need to set up a separate user for recordings though.

Edit:
Do you think voice recognition software can take over the medical transcription industry one day? I think if this happens, it will be the first win of Artificial Intelligence over human beings, taking our jobs away.

I personally don’t think it will happen any time soon. You can train these programs to suit your own vocabulary, but medical records are too important to be making mistakes. It’s hard enough for even the medical people, being human, to get it right all the time. It could mean the difference between life and death if you make a mistake in someone’s medical record.
 

merrillmck

I have my doubts; there's too many variations/accents on something like English to code in, let alone all the varying dialects of Spanish, French, German, Japanese, etc.

It, like image recognition (just voice recognition in 2+ dimensions) and action recognition (3+ dimensions), is a very difficult problem.

While better algorithms and faster hardware will always make these technologies improve, there are some who believe there is a ceiling that can never be passed. As a metaphor, humans often have to ask someone else to repeat themselves (voice). We often need to squint or move closer or use context clues to recognize something visually --- and then we're often taking an educated guess.

In my opinion, recognition technologies will be used in specific situations: those situations where errors are tolerated; situations where a computer's great memory can be leveraged (humans have trouble memorizing a database); controlled situations (limiting the problem); and situations where using people is more expensive than using a computer or a computer's assistance.
 

zen-r

Yes, I think we are a long way off from anywhere near perfect recognition technology.

Image recognition is particularly hard but, as you said merrillmck, specific situations are more achievable. Researchers are making good in-roads into face recognition technology, especially in areas where it can be used to identify people (eg football hooligans!) from CCTV footage. Of course, it still requires a good frontal image, & can only work within a very limited database of known offenders. But it is a start, & who knows how far it will "progress" (rather worrying, if you aren't into the "Big Brother is always watching you" scenario).

Certainly, image recognition is already being applied on a daily basis now here in the UK, to identify cars based on their licence plates. We have road cameras everywhere, & the identification of cars is used in everything from measuring your car's average speed over a long stretch of road (cameras at start & end points), to instantly alerting roadside police cars if a vehicle passing them hasn't got valid car tax, insurance or there are other offences against them. It's getting too hard to cheat!
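The average-speed check described above comes down to simple arithmetic once the same plate has been read at both cameras. A minimal sketch, assuming a made-up 5-mile stretch and made-up timestamps (the function name and all values are illustrative, not taken from any real camera system):

```python
from datetime import datetime


def average_speed_mph(distance_miles, t_start, t_end):
    """Average speed of one vehicle between two camera sightings."""
    elapsed_hours = (t_end - t_start).total_seconds() / 3600.0
    if elapsed_hours <= 0:
        raise ValueError("end sighting must be later than start sighting")
    return distance_miles / elapsed_hours


# Hypothetical sightings of the same plate at cameras 5 miles apart.
start = datetime(2009, 6, 1, 14, 0, 0)
end = datetime(2009, 6, 1, 14, 4, 0)  # 4 minutes later

speed = average_speed_mph(5.0, start, end)
print(round(speed, 1))  # 75.0
```

The hard part is the plate recognition itself; once two sightings are matched to the same plate, the speed follows directly from distance over time.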

I wonder how many years before Google attempts an image search facility. It would be pretty useful, though very hard to achieve. For example, I upload a picture of John Smith, & Google returns any websites that contain a similar image of him.

Far more do-able, & a facility I looked for on the web a few weeks ago without success, would be a search engine that finds an exact match for your image. So, for example, if I have a drawing that I downloaded from somewhere but I can't remember where, I upload it to the search engine & it finds me sites with identical images on them.

merrillmck, if there isn't a site out there which I have missed & already does this, perhaps you could get on & perfect this technology. You could be the founder of the next big search engine, & make Google look like small-potatoes! ;)
 

jmcgowan

I heard a rumor once of a web site where you can record yourself humming a part of a song and it will attempt to name the song and artist. If it doesn't exist yet it would certainly be a nice website to own...
 

merrillmck

I wonder how many years before Google attempts an image search facility.

They are knee deep in the latest image and video recognition algorithms already. Currently, I would guess that their image search relies primarily on tags (directly and indirectly). Another likely input is what users click on after a particular search. Image and video recognition are probably included in the search algorithm, but given an extremely small weight due to the lack of reliability.

Does Google have a music/sound search?
 

zen-r

They are knee deep in the latest image and video recognition algorithms already. Currently, I would guess that their image search relies primarily on tags (directly and indirectly). Another likely input is what users click on after a particular search. Image and video recognition are probably included in the search algorithm, but given an extremely small weight due to the lack of reliability.

Does Google have a music/sound search?

Which image search are you referring to? Is it in their Labs section?

If you are referring to their image search based on text entry by the user (rather than the user entering/uploading an image to search against) then I wasn't aware of Google using anything other than tags & text in their algorithm.

What you suggested is possible, though. Were you just speculating that they also use image recognition in the algorithm, or have you read that somewhere?

As for sound search, that's another thing I haven't seen yet, but I wouldn't be surprised if Google or someone else has a go at it in the future. I expect that it will depend on processor speeds improving & also on some significant advances being made in the software.
 

frogamus

Far more do-able, & a facility I looked for on the web a few weeks ago without success, would be a search engine that finds an exact match for your image. So, for example, if I have a drawing that I downloaded from somewhere but I can't remember where, I upload it to the search engine & it finds me sites with identical images on them.

It is a wonder more people don't do that when looking for people using copyrighted material.
 

merrillmck

Which image search are you referring to? Is it in their Labs section?

http://images.google.com/
http://video.google.com/
http://www.youtube.com/ (owned by Google)

I'm speculating about how they use recognition algorithms to supplement their search algorithms. I know they hire PhDs in computer vision (image and video recognition specialists).

Their internal research in this area may be just that - or it could be actively used today in their primary search algorithms. Obviously, only an insider would know.

As for speed, most of the work would be done offline. It would be too slow to wait until someone submits their search query. One way to do it would be to run the recognition algorithms on images found on the net and generate tags with confidences - a picture of a sailboat in the ocean might generate [boat, water, ocean, sky, sailboat, motorboat, vacation, blue, sea, river]. If there were no other clues from the internal file comments, website tags, a descriptive filename, and there was no search history then these auto-generated tags would be all you'd have to go on. The search rank (ie confidence) would be very low. However, if everyone started clicking on the image after doing sailboat searches then you'd adjust the search rank [and likely feed that back into your original recognition algorithm].
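As a rough illustration of the offline-tags-plus-click-feedback idea above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `TagIndex` class, the confidence values, the image ids, and the update rule (nudging a confidence toward 1.0 on each click). It is not a description of how Google or anyone else actually does it.

```python
from collections import defaultdict


class TagIndex:
    """Toy index: auto-generated tags with confidences, adjusted by clicks."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        # tag -> {image_id: confidence in [0, 1]}
        self.scores = defaultdict(dict)

    def add_auto_tags(self, image_id, tags_with_confidence):
        # Offline step: store whatever the recognition algorithm guessed.
        for tag, confidence in tags_with_confidence.items():
            self.scores[tag][image_id] = confidence

    def record_click(self, query_tag, image_id):
        # Online step: a click after a search moves the confidence toward 1.0.
        current = self.scores[query_tag].get(image_id, 0.0)
        self.scores[query_tag][image_id] = (
            current + self.learning_rate * (1.0 - current)
        )

    def search(self, query_tag):
        # Rank images for a tag by their current confidence, highest first.
        hits = self.scores.get(query_tag, {})
        return sorted(hits, key=hits.get, reverse=True)


index = TagIndex()
# Hypothetical low-confidence guesses from an image recognizer.
index.add_auto_tags("img_sailboat", {"boat": 0.30, "sailboat": 0.20, "ocean": 0.25})
index.add_auto_tags("img_ferry", {"boat": 0.35, "sailboat": 0.22})

before = index.search("sailboat")
for _ in range(3):
    index.record_click("sailboat", "img_sailboat")
after = index.search("sailboat")

print(before)  # ['img_ferry', 'img_sailboat']
print(after)   # ['img_sailboat', 'img_ferry']
```

The auto-generated tags start with low confidence, so the ranking is initially close to a guess; a few clicks are enough to move the real sailboat picture to the top. Feeding the clicks back into the recognizer itself would be a further step not shown here.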

By the way, there are published papers on this "idea" at conferences like the International Conference on Computer Vision (ICCV) and Computer Vision and Pattern Recognition (CVPR). This idea is not my own. :)
 

shervinemami

zen-r: Google DOES definitely do research in Face Recognition and tracking in photos and videos. I know because earlier this year when I was working on a Face Recognition system for a mobile robot, I was about to implement a similar algorithm to the one that Google published in "http://research.google.com/pubs/pub34394.html"

Basically, Face Recognition and Face Tracking have developed a lot over the past decade because there is so much commercial & government interest in being able to automatically find criminals & terrorists in security videos, etc. And now there's a lot of new interest in automatically tagging pictures on the internet, which is what Google (and several other companies) are trying to do. There's already a number of websites that let you search images on the web (like Google Images), except that they actually use Image Recognition techniques to look at what is in the images instead of just their filenames or webpage content near the image (usually called Image Retrieval). And there's also websites that try to train their image recognition systems by asking people to tag their photos by hand while playing games. (Unfortunately I can't remember website addresses right now, but I can find some if you want).

Face Recognition can be tweaked these days to be good enough that some laptops can log you into your computer using the Face Recognition system, and you can even buy a lot of cheap digital cameras that do Face Detection or even Face Recognition in hardware that work relatively well. There are also a few different companies that will try to automatically tag you in your Facebook photos (I know because I created one of them :) ). None of the Face Recognition systems today are actually intelligent or dynamic enough to work well in many real-life situations, but they work well enough for some commercial applications, which is why it is a multi-billion dollar industry.

On the other hand, Speech Recognition hasn't improved nearly as much as people expected it would have by now. If you try using any of the free or commercial programs, you'll see that they can work very well if they have a very clear audio signal and an American speaker, talking slowly and using only a limited vocabulary. But once you try it with someone with a slight accent, or talk at the speed of normal speech, or have any sort of noise in the background or in the hardware, or have a large vocabulary of possible words, it becomes completely unusable! A few months ago we tried both of the leading speech recognition systems, and because we had a little bit of background fan noise (since it was on a robot!) and since we had Australian or Asian or European accents, it could only tell the difference between "Yes" and "No" correctly 50% of the time! In other words it might as well have guessed!

The funny thing is that Bill Gates himself keeps "predicting" that in the next 5-10 years, Speech Recognition will be the biggest revolution in computers. He's been saying that every year for over 12 years! "http://mpt.net.nz/archive/2005/12/30/gates"

So don't expect computers to be highly intelligent any time soon. They will continue to get more intelligent and take over more & more of the basic jobs, but even though I'm a robotics and artificial intelligence developer, I definitely feel that computers will need decades before they are good at doing AI stuff nearly as well as humans or even animals do!

Cheers,
Shervin.
 

zen-r

It is a wonder more people don't do that when looking for people using copyrighted material.

I agree, & think that this will be among the first most easily achievable uses for an image recognition search engine.

As I said earlier, the search engine could simply look for an exact match for your image. This limited functionality wouldn't require the complex analysis & identification of objects in the image, as described in merrillmck's last post. All it would need to do is map the lines & nodes etc of the image, much like fingerprinting technology already does.
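To make the "fingerprint, don't understand" idea concrete, here is a minimal Python sketch. It swaps in a well-known simple technique, an average hash over a shrunken grayscale image, rather than the lines-and-nodes mapping described above, and it assumes images are already decoded into 2D lists of grayscale values. The `ImageIndex` class and the example.com URLs are hypothetical.

```python
def average_hash(pixels, size=8):
    """Fingerprint a grayscale image (2D list of ints) as size*size bits."""
    height, width = len(pixels), len(pixels[0])
    # Shrink to a size x size grid by nearest-neighbour sampling.
    small = [
        pixels[(r * height) // size][(c * width) // size]
        for r in range(size)
        for c in range(size)
    ]
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        # One bit per sample: brighter than the average or not.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bits in which two fingerprints differ."""
    return bin(a ^ b).count("1")


class ImageIndex:
    """Toy index mapping fingerprints to the pages where an image was seen."""

    def __init__(self):
        self.entries = []  # list of (fingerprint, url)

    def add(self, url, pixels):
        self.entries.append((average_hash(pixels), url))

    def find(self, pixels, max_distance=0):
        query = average_hash(pixels)
        return [
            url
            for fingerprint, url in self.entries
            if hamming_distance(fingerprint, query) <= max_distance
        ]


# Synthetic 16x16 test images: a gradient, a brightened copy, and its inverse.
drawing = [[8 * (x + y) for x in range(16)] for y in range(16)]
brighter_copy = [[v + 10 for v in row] for row in drawing]
inverted = [[255 - v for v in row] for row in drawing]

index = ImageIndex()
index.add("http://example.com/drawing.png", drawing)
index.add("http://example.com/other.png", inverted)

matches = index.find(brighter_copy)
print(matches)  # ['http://example.com/drawing.png']
```

No object in the picture is ever identified; the search only compares compact fingerprints. Raising `max_distance` above 0 would also tolerate small changes such as recompression, at the cost of more false matches.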

I imagine that this type of search could be a paid-for service, because it would be hard for companies like Google to keep it free by placing targeted ads, if the subject of the image couldn't be recognised by the software. As long as there aren't many competitors in this field of search, & as long as the search was useful to companies needing to track down copyright violations, there would presumably be some opportunity for charging real money!

@ shervinemami - if you remember any image recognition search engines, please post them here. It will be interesting to try them out.

Most stuff that I find is just discussion of what is being developed - no good working sites. The best I've found so far are these 2 sites:

http://ideeinc.com - still in beta.
http://www.like.com - just a shopping product search.

I'm sure that there must be more, & better, already out there somewhere. :)
 

shervinemami

Here are a few websites that let you play collaborative games, which are actually being used to train AI programs to do Image Recognition. The games are actually very addictive somehow!

http://www.espgame.org/gwap/ (I think this was the original website using the idea)

http://www.peekaboom.org/ (For some reason, the website redirects you to the ESP Game if you don't click a link quickly!)

http://www.peekaboom.org/phetch/ (Another game by the same group)

Google seems to have bought the original ESP Game, and is using it to improve Google Images:
http://images.google.com/imagelabeler/

Here is a web search engine that lets you also search images using Face Detection. For example, you can tell it to search for "Paris", and then when it has pictures in France and pictures of Paris Hilton, you can select whether you want it to only show images of faces (using AI software to automatically detect a face within the image) or tell it to ignore faces, so it would show things like France.
http://www.exalead.com/search/image/

Google provides the same sort of feature but it's hidden. You have to add "&imgtype=face" to the end of the URL. Here is a website that lets you do the Google Images Face search easily:
http://blogoscoped.com/archive/2007-05-28-n84.html
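For anyone who would rather build the URL themselves, here is a small Python sketch of appending that parameter. The "imgtype=face" parameter is the one mentioned above; the example search URL (its path and the "q" parameter) and the `add_face_filter` helper are assumptions for illustration, and the parameter may not be honoured by the live site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_face_filter(search_url):
    """Return search_url with imgtype=face set in its query string."""
    parts = urlsplit(search_url)
    # Keep the existing parameters, dropping any earlier imgtype value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "imgtype"
    ]
    params.append(("imgtype", "face"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Assumed shape of an image search URL, for illustration only.
url = add_face_filter("http://images.google.com/images?q=paris")
print(url)  # http://images.google.com/images?q=paris&imgtype=face
```

Parsing the query string instead of just concatenating text avoids ending up with a doubled parameter when the URL already carries an "imgtype" value.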

These 2 search engines only do Face Detection, where it is trying to find ANY face within an image, which is different to Face Recognition, where you KNOW you have a picture of a face, but you are trying to figure out WHO it is. They might sound similar, but they have completely different algorithms and problems to solve. When people talk about "Face Recognition", they usually actually mean "Face Detection"! But since when did mass media and common knowledge need to be based on true facts ;-)

There are a bunch of other websites besides these 2 that are also letting you do image recognition searches:

http://alipr.com/ (several types of object or face detection searches)
http://www.riya.com/ (several types of object or face detection searches)
http://www.attraseek.com/ (search for images like the one you uploaded)
http://www.pixsta.com/ (search for similar clothes to buy, using your uploaded image)
http://www.picollator.com/ (search for similar looking faces using Face Recognition on your uploaded image)
http://www.polarrose.com/ (automatically detect faces in your uploaded photos)


Cheers,
Shervin.
 

zen-r

Thanks for the links.

That will give me something to experiment with for a while! :)
 