From Dirt to Shovels: Automatic Tools Generation from Ad Hoc Data

Speaker: Kenny Zhu
Time: May 11, 3:00 PM

Abstract:
[...] analysis and transformation tools are readily available. Such data is pervasive in many areas, such as scientific repositories, financial data, system logs and configs, and sensor outputs. In this work, we demonstrate that it is possible to generate a suite of useful data processing tools directly from the ad hoc data itself, without any human intervention, thereby improving the productivity of data analysts.

The key technical contribution of the work is a multi-phase algorithm that automatically infers the structure of an ad hoc data source and produces a format specification in a declarative language called PADS. Such specifications can be used to generate printing and parsing libraries, as well as other useful tools for processing the data. At the end of the talk, I will briefly introduce a few exciting new ideas from ongoing work that further improve the productivity of ad hoc data users.

Bio:
Kenny Zhu graduated with a B.Eng. in Electrical Engineering and a Ph.D. in Computer Science, both from the National University of Singapore. Prior to joining Princeton in 2007, he was a software design engineer at Microsoft in Seattle. His main research interests are languages and systems for data processing, artificial intelligence, and concurrent/distributed systems. He has published in top-tier conferences such as POPL, SIGMOD, ICDE and ICLP, and has been an active reviewer for various conferences and journals. His current research is centered on the PADS data description language.
