The ContX team’s vision is to build a service that makes diverse contextual information available to Amazon Advertising, and to partner with ad programs to launch ad products based on contextual targeting. Contextual information will include keyword extraction, web page classification, topic modeling, page layout information and product assignment. It will cover text-, image- and video-based extraction, as well as extraction from mobile apps. In addition to solving core extraction problems, the team will build crawl infrastructure and address problems such as crawl scheduling, URL normalization, sitemap generation, webpage de-chroming and app crawling, all of which are necessary for building a large-scale contextual extraction system.
The contextual service built by the team will enable the use of contextual data for keyword-based and product-based targeting, improve ad performance by providing contextual signals for modeling, and provide webpage/app classification as a solution. The extraction platform can be used in a self-service mode to spin off multiple text mining products such as synonym generation, lookalike modeling and ad position detection. The team will make the crawler service available as a managed framework with the ability to introduce custom plugins. The end-to-end system will scale to support crawling and extraction of hundreds of millions of URLs and tens of thousands of apps.
We are looking for Scientists with deep and broad knowledge of text mining, information retrieval, information extraction, natural language processing and machine learning. You have reasonable programming and design skills, enough to manipulate unstructured and big data and to build prototypes that work on massive datasets. You will apply business knowledge to perform broad data analysis as a precursor to modeling and to provide valuable business intelligence.
Amazon.com offers a competitive salary, stock grants, and health and other benefits.