For the past year, we’ve been traveling around the world collecting speech data that’ll be used to develop in-car navigation systems in multiple languages. Luckily, we were able to collect a large chunk of the data in our beautiful hometown of Vancouver, BC. Being located in such a diverse city is a huge benefit for projects that require native language speakers – people worldwide flock to Vancouver for its mountains, seaside living, and fresh air. In fact, for some smaller-scale projects, we’re able to find enough native language speakers without boarding an airplane at all.
This project, on the other hand, was pretty massive. We met and conducted recording sessions with over 2,000 participants, collecting thousands of hours of speech data while we were at it. It also required us to travel to a total of nine countries, including Poland, Turkey, Korea, Germany and Brazil. Not that we minded, of course! Any excuse to travel and experience different food, culture and people is good enough for us.
In-Car Speech System Data Collected In Cars
Because speech systems (whether they are used inside a car, in your home, or on your phone) respond to, well, speech, we have to collect the most authentic speech data possible. This means conducting data collection in the actual environment in which the system will be used; in the case of this project, inside cars. Replicating the environment is important in ensuring the technology will not respond to a horn honking in the background, a truck speeding loudly by on a highway, or the ambient chatter of your friends in the backseat. Indeed, you have to teach the speech recognition technology not only what commands to respond to, but also what sounds not to respond to.
Participants were tasked with repeating a pre-determined list of phrases as well as asking things of the in-car system naturally. This is because there is more than one way of saying things: for instance, “Play We Are The Champions, please” versus “Can you turn on We Are The Champions by Queen?” If the speech system only understands one version of a command, then you’re going to end up with some very frustrated drivers. There’s a lot more that goes into data collection methods for emerging technology than you might think!
Dealing with the Unexpected
Natural language processing projects like these are a big part of our operations. As most of these projects serve R&D purposes, they involve a lot of unknowns. We love the challenges they bring and the new opportunities to innovate. Thinking outside of the box and being creative is the only way to succeed in projects like these.
Needless to say, we gained some new skills along the way. We learned to be creative with how to warm up and cool down cars in extreme weather conditions (ranging from -11°C in China to 38°C in Brazil) without creating noise that would be picked up by our microphones. We became experts in detecting and solving static noise issues. We learned how to make participants from different cultures feel comfortable sitting in a car with complete strangers. We developed a sincere appreciation for FedEx, who did an amazing job of moving our equipment from one country to the next. When they weren’t an option, we developed some solid persuasion skills that came in quite handy when we had to convince airport security and border patrol that our six computers, microphones and other sound devices weren’t going to be used for a secret spy mission.
An End to a Year-Long Project
While we certainly feel accomplished having completed such a large and complex project, we’re also sad to see it come to a close. Traveling for a living and having the fortune to meet such wonderful, talented and hardworking people all around the world is one of the reasons we love what we do.
If you’re interested in what we do or want to learn more about speech data collection in Vancouver or elsewhere around the world, reach out to us. Or, dig around our website and read more about our data collection services.