Finding Nuggets – Quickly – in a Heap of Voice Collection, From Mexico to Afghanistan

May 5, 2015

FROM: [...] Senior Technical Development Program -- Class of [...] and Intelligence Analysis Technical Director, NSA Texas
Run Date: 05/25/2011

Recently I had a rare life-changing instance where the highly unexpected occurred, a so-called "black swan" event.* What was this event? It was my exchange with the Human Language Technology (HLT) Division in the Research Directorate and with many other HLT believers.

(U) Before submerging into HLT as my Senior Technical Development Program goal, I had no clue what HLT was. If asked, I would have defined the Human Language Technology concept as the ability of linguists to use the available technology to search for reportable intelligence from voice traffic. Speaker identification was no more than "yes, that's the guy!" I imagined. Language identification was even simpler: Spanish, Spanish; and keyword search, what's a [...]

My definition of HLT has changed through the years. The Human part involves a lot more than simply understanding the target language, given the need for a language analyst to know customers' requirements, collection techniques, global networks [...], and legal rules. As for the Language, it is no longer enough to know what was said in speech or conveyed in writing. In order to determine intent, the analyst must know how things were said, in what tone and accent, the mood of the person, vocabulary usage, religious and political beliefs, nationality, type of device used, communication patterns, associations, and the location of the speakers to the nearest cell tower! Then there is Technology, which has given [...] a more comprehensive view of their target space and has also made that target space larger and impossible to navigate "a cappella."

To successfully track those elusive and always-mobile targets, we must find them, regardless of which communication method they happen to be using.
What we do not know is how many wrong-doers we might have overlooked or how many illegal operations we might have failed to uncover because of the volumes of data that we were not able to scan. That's where Human Language Technology can help: it can find the exact traffic of interest within a mass of collection. To sell the technology to those who would benefit from it, we need to convince a veteran linguist that a computer will actually find a target of interest using a statistically generated voice model, even if the phone number has changed! It's not an easy sell. We must also change the way we process and analyse collected intelligence. However, the HLT organisation (HIST, formerly R64) is helping [...] embrace the HLT concept and is enabling them to see further into their overstuffed queues. Thanks to the support of the SCS Director, researchers from R64, accompanied by this lone S2 language analyst, delivered HLT analytics to F6 sites in the Americas; the immediate success of the first HLT-Labs system was possible not just because the technology was mature enough, but because visionary leaders --with [...] at the top-- and open-minded field [...] have accepted both the virtues and the flaws of this revolutionary technology. SCS Spanish Language Voice [...] have learned to exploit the advantages of speech-to-text keyword search, and have quickly integrated speaker, gender, and language recognition into their [...] NSA-Texas [...] who were part of the initial testing and
validation of HLT analytics helped verify their utility, and the successes multiplied. Whether finding tunnels in Tijuana, identifying threats in the streets of [...] City, or shedding light on the shooting of US officials in Mexico, the [...] did what it advertised: it accelerated the process of finding relevant intelligence when time was of the essence. (See related article.)

I did not expect to find myself [...] explaining the [...] to military leaders [...] translators at the Afghanistan Remote Operations Cryptologic Center (AROCC) to research and analysis skills, but then again, black swan events are unpredictable. DIRNSA appointed one of his military [...] ([...] below with author in Kandahar) [...] dedicated an exemplary Army officer -- to lead the [...] of HLT analytics to Afghanistan. From Kandahar to Kabul, we have traveled the country explaining NSA leaders' vision and [...] SIGINT teams to what HLT analytics can do today and to what is still needed to make this a game-changing success.

While the challenges of the Afghanistan language mission are not insurmountable, it will take many months for HLT analytics to reach the same level of [...] as the Spanish systems deployed to SCS. The AROCC and SCS missions are different but the underlying [...] is the same: [...] enough talented language [...] to process everything we collect. With a commitment to [...] the speech-to-text functionality, with support [...] S, T, NSA Georgia, the AROCC and NCR leadership, and more [...] with the keen interest of every language analyst in theater to learn about and use HLT tools, in time this will likely redefine the way speech is processed in the SIGINT missions of Afghanistan.

Analytic modernization is as much about [...] as it is about people; understanding the needs of [...] and shaping technology to enable them to succeed is perhaps the most satisfying result, whether prosecuting narcotics traffickers in Mexico or Taliban leaders in Afghanistan.

(U) Notes:
(U) * The term comes from Nassim Nicholas Taleb's book "The Black Swan."