ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, August 2018, 1675-1684. TruePIE: Discovering Reliable Patterns in Pattern-Based Information Extraction. Qi Li, Meng Jiang, Xikun Zhang, Meng Qu, Timothy Hanratty, Jing Gao, Jiawei Han.
Research in this project has been integrated into the PI's data mining courses and seminars at UB via course projects and lectures. The majority of the research results in the PhD students' dissertations came from this project. In particular, two PhD students received the "Best PhD dissertation award" in the Department of Computer Science and Engineering, University at Buffalo in 20 respectively. The students' research skills improved greatly through this project, as demonstrated by their publications in top conferences and journals. This project trained six PhD students, one master's student, and seven undergraduate students, including three female students and one African American student. The PI also discussed this research with high school students and undergraduate students at various outreach activities that promote "Women in STEM" at UB. The PI gave invited talks at workshops, industrial labs, and universities to present the research results of this project. In this project, we conducted an extensive survey on truth discovery, which was published in SIGKDD Explorations, and we presented an overview of the truth discovery field in several tutorials at the VLDB, KDD, SDM, and CIKM conferences. Research results from this project were presented at top conferences in the data science field, including KDD, VLDB, SIGMOD, SDM, ICDM, and CIKM. These approaches can potentially benefit any other application in which decisions have to be made based on reliable information extracted from diverse and heterogeneous sources.
The effectiveness of the developed approaches was demonstrated on a variety of datasets drawn from multiple application scenarios, including crowdsourcing question answering, Internet information fusion, weather forecast integration, drug side-effect discovery, air quality monitoring, and indoor floorplan construction. We also modeled correlations among sources and objects, derived fine-grained reliability degrees of sources and confidence degrees of the truths, considered the existence of true claims in the data set, and provided geometric interpretations of the truth discovery approach. Specifically, we developed novel truth discovery methods for data of heterogeneous types, data with long-tail distributions, streaming and time series data, distributed data, and textual data. This project contributes to the development of this emerging field by developing truth discovery and information integration techniques that tackle unsolved challenges in this task. This significantly improves data aggregation performance by exploring the wisdom in the minority. Truth discovery can detect the truth even when it is in the hands of only a few sources, provided those few sources are reliable. The traditional conflict resolution approach, majority voting, often fails because sources may have different reliability levels. When conflicting information from multiple sources is collected, it is important to find the reliable sources and identify the true facts. Truth discovery is an emerging field in the data management and mining community.
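The contrast drawn above between majority voting and reliability-aware truth discovery can be illustrated with a small sketch. It is not one of the methods developed in this project: the toy `claims` data, the function names, and the smoothed log-odds weighting rule are all assumptions chosen to keep the example short. The loop alternates between estimating source weights from agreement with the current truths and re-estimating the truths by weighted voting.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: claims[source][object] = claimed value.
# Sources A and B are always right ("t"); C, D and E err often and
# all three agree on the same wrong value "w" for object o5.
claims = {
    "A": {"o1": "t", "o2": "t", "o3": "t", "o4": "t", "o5": "t", "o6": "t"},
    "B": {"o1": "t", "o2": "t", "o3": "t", "o4": "t", "o5": "t", "o6": "t"},
    "C": {"o1": "t", "o2": "f", "o3": "t", "o4": "f", "o5": "w", "o6": "f"},
    "D": {"o1": "t", "o2": "t", "o3": "f", "o4": "g", "o5": "w", "o6": "t"},
    "E": {"o1": "f", "o2": "t", "o3": "g", "o4": "t", "o5": "w", "o6": "f"},
}


def majority_vote(claims):
    """Pick the most frequently claimed value for each object."""
    votes = defaultdict(Counter)
    for source_claims in claims.values():
        for obj, value in source_claims.items():
            votes[obj][value] += 1
    return {obj: counter.most_common(1)[0][0] for obj, counter in votes.items()}


def truth_discovery(claims, iterations=10):
    """Alternate between weighting sources and re-estimating truths."""
    truths = majority_vote(claims)  # start from the unweighted answer
    weights = {}
    for _ in range(iterations):
        # Step 1: weight each source by how often it agrees with the
        # current truths (add-one smoothed log-odds; an assumed rule).
        # Sources that are wrong more often than right get a negative weight.
        weights = {}
        for source, source_claims in claims.items():
            correct = sum(1 for obj, v in source_claims.items() if truths[obj] == v)
            wrong = len(source_claims) - correct
            weights[source] = math.log((correct + 1) / (wrong + 1))

        # Step 2: re-estimate each truth by a weighted vote.
        scores = defaultdict(lambda: defaultdict(float))
        for source, source_claims in claims.items():
            for obj, value in source_claims.items():
                scores[obj][value] += weights[source]
        new_truths = {obj: max(vals, key=vals.get) for obj, vals in scores.items()}

        if new_truths == truths:  # converged
            break
        truths = new_truths
    return truths, weights


baseline = majority_vote(claims)
truths, weights = truth_discovery(claims)
print("majority vote on o5:  ", baseline["o5"])
print("truth discovery on o5:", truths["o5"])
```

On this toy data, majority voting selects the value shared by the three unreliable sources for `o5`, while the iterative scheme sides with the two sources that agree with the consensus everywhere else.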