For Media Mining, the Future is Now! (conclusion)

May 5, 2015

Page 1 from For Media Mining, the Future is Now! (conclusion)
(U) For Media Mining, the Future Is Now! (conclusion)

FROM: [...] and Human Language [...] (S23)
Run Date: [illegible]

[...] Media Mining Across a Wide Range of Languages

One of the challenges in deploying this Media Mining HLT is the need to cover the very broad range of languages. [...] of the languages of interest to the Agency are of interest to commercial concerns because they are likely to be profitable, and businesses run on profit. Though COTS products such as NEXminer have covered "dense" languages such as English and Spanish, and have made great inroads lately into a few less-[...] languages and dialects found in the Middle East, it is unclear that any COTS product will ever cover the vast [...] of languages that [...] are required to understand. Therefore, the HLT PMO is developing an enhancement of this Media Mining [...] that can process over 90 languages using a combination of language-specific and universal phones. This agency capability, developed within R64, the Human Language Research Group, is [...] as Universal Phonetic Recognition (UPR).

[...] New languages can be easily added to the [...] by drawing on Agency linguistic [...] of a language combined with publicly available language resources. As events shape our language needs, UPR provides a way to [...] within minutes to new language needs, for example to the GWOT.

(U) IVE: Technology that Can Separate the Wheat from the Chaff

[...] A second, equally important enhancement under development is the ability for this HLT capability to predict what intercepted data might be of interest to [...] based on the [...] past behavior. Much like the way in which popular [...] sites like [...] are able to track and predict buyer preferences, integration of Intelligence Value Estimation (IVE) [...] on SRI and message content offers the promise of presenting [...] with highly enriched sorting of their traffic. Imagine if you came to [...] each day knowing that the best five intercepts needing transcription were sitting at the top of your queue waiting for you.

Of course, such Media Mining IVE capabilities need [...] be limited to SRI and key [...] searches. In collaboration with S2UEB, Analytic [...] for the Enterprise, the HLT PMO Media Mining team is also developing new metadata analysis capabilities based on language, speaker, gender, and dialect identification, presenting this [...] to [...] through conventional query [...] such as UIS. Advanced [...] like [...] are integrating other [...] of [...] such as geospatial [...] will also send automatic alerts to [...] when incoming intercept meets certain search criteria.

[...] VoiceRT will be integrated with standard Agency voice [...] such as UIS and [...] will be able to configure the [...] via the web, and access scores on their traffic using NUCLEON.
Page 2 from For Media Mining, the Future is Now! (conclusion)
(U) Bringing it All Together

The integration of these technologies into an automated system will bring two major innovations: faster response time and improved productivity. Our challenge goal is to "index, tag, and graph" all incoming intercept, and this will soon be within reach. Using HLT services, a single analyst will be able to sort through millions of cuts per day and focus on only the small percentage that is relevant. The amount of collection can be increased orders of magnitude without further stressing the analyst population, allowing the Agency to cast a much wider SIGINT net and take in a much richer catch.

[...] And again, the power of HLT is truly realized through integration of multiple SIGINT technologies. In the future, we will further develop technologies such as word search to support cross-lingual queries. Sites that lack expertise in a given language will be able to issue queries in English and receive results translated from the target language back into English. This marriage of word search and Machine Translation has great potential as a force multiplier. Mapping meaning and tradecraft across languages will be a key challenge here.

[...] Similarly, because a search term will be tagged with a "semantic class identifier," such as "place name," it will be relatively [...] to integrate this technology with the Enterprise Knowledge System and allow sophisticated capabilities such as social network analysis to operate on voice content. In the HLT PMO long-term vision, [...] will be able to construct complex queries, such as, "Where is the mayor of Baghdad?" or "Show me all the intercept containing [...] about explosive devices that occurred yesterday in the downtown area of Baghdad near the Al-Rashid Hotel," and obtain answers directly in English, or in their foreign language if they prefer, with a link to the documents containing the answers.

We are entering a golden age for HLT. Powerful and inexpensive computers, high-speed networking, and advanced algorithms are being combined to revolutionize the analyst [...]. For more information about these capabilities, please contact the HLT PMO office ("go HL [...]" or call [...]).