The Forward Data Lab focuses on data: integrating, managing, searching, and mining it wherever it lives, first in databases, then on the Web, and now all over our social universe.
Without doubt, data is an indispensable ingredient for enabling algorithmic productivity and intelligence, and we are fortunate to be immersed in a digital world with unprecedented availability of data. However, to unlock the potential of data, we face many barriers, and our Forward Data Lab enjoys tackling these challenges. Overall, our research aims to bridge structured and unstructured big data: to bring structured, semantically rich access to the myriad and massive unstructured data that accounts for most of the world's information. Our research therefore spans data mining, data management/databases, information retrieval, and machine learning, with current efforts focusing on interactive data management, social media analytics, social network mining, and entity-centric Web search and mining.
DataSpread: Enabling Interactive Big Data Management. (2015 - Present) We aim to integrate the two disparate paradigms for accessing tabular data, databases and spreadsheets, marrying them so that interactive access at the front end is combined with the power of a query and storage engine at the back end. (Demo: VLDB'15)
BigSocial: Towards Big Social Data Platform for Entity-Centric and User-Aware Analytics. (2012 - Present) Now that people are connected in social networks and our voices are heard via social media, we aim to exploit these new and vast “human sensors” prevalent in our digital society, to listen to the whole world and make sense of it. [SIGIR'12, KDD'12, VLDB'12, ICDE'13b, VLDB'13a, VLDB'13b, EDBT'14, WWW'14, ICML'14, KDD'14, BigComp'15, IJCAI'15, VLDBJ'15, AAAI'16, ICDE'16] (Demos: ICDE'12, ICDM'15) Supported by NSF Award 1619302, $500,000, III: Small: Social Discovery of Users and Content in Social Media Through Similarity-Based and Graph-Based Inference of Attributes and Queries. PI: Kevin C.C. Chang. News. Selected Publications
WISDM: Web Indexing and Search for Data Mining. (2007 - Present) The Web has gone far beyond a corpus of pages: it contains all sorts of "stuff". Can we search the Web for every "thing" it contains, that is, entities and their relations? [CIDR'07, VLDB'07, WSDM'10, ...] Online Demo. Entity Search (Prototype system over 500 million English pages in the ClueWeb09 corpus, for 10+ entity types, running on a PC cluster.) Example queries: 1) Google founder #person; 2) bird flu #country; 3) high blood pressure treatment #drug; 4) kevin c chang #email. Selected Publications
MetaQuerier: Exploring and Integrating the Deep Web. (2001 - 2007) The Web has deepened dramatically: a significant and increasing amount of information is now hidden on the "deep Web," behind the query interfaces of searchable databases. Can we enable access to, and integration of, such dynamic data? [KDD'02, ICDM'02, SIGMOD'03, SIGMODRecord'04, SIGMOD'04, KDD'04, TKDE'04, CIKM'04, VLDB'05, CIDR'05, KDD'05, TODS'06, CACM'07, VLDB'07, CIKM'08] (Demos: SIGMOD'04, SIGMOD'05, ICDE'05, ICDE'07)
Selected Publications
AIM: Supporting Efficient Top-k Ranked Query Processing -- AIMing for top query answers. (2001 - 2007) Our goal is to support ranked queries, or top-k queries, which match data by "soft" conditions such as similarity, relevance, or preference and return the best k answers. [SIGMOD'02, SIGMOD'05, SIGMOD'06a, SIGMOD'06b, ICDE'07, TODS'07, TKDE'07, SIGMOD'07a, SIGMOD'07b, TODS'08] (Demos: VLDB'05, SIGMOD'07)
Selected Publications