Academic and Industry Collaboration

At CoreLogic, we have a strong track record of working with higher education institutions to give university students access to real-world data and to open career opportunities through internships and graduate-level research. These collaborations enrich students’ learning experiences and build a talent pipeline for our organization.

Partnership with Chapman University

In 2017, CoreLogic began a partnership with Chapman University, a top private university in California, to kick off a Machine Learning (ML) Challenge initiative consisting of three individual machine learning challenges aimed at real-world data mining problems. Twenty-nine students, divided into 10 teams, participated in the first ML challenge, offering innovative solutions on the topic of “Obstructed View and Sentiment Analysis.”

The second ML challenge was concerned with “Data Operations Automation.” Students were challenged to come up with novel solutions to categorizing document images through identification of the document source and type, along with creation of a file index for identifying documents using both Optical Character Recognition (OCR) and Natural Language Processing (NLP).

The third ML challenge was by far the most difficult one. Students had to apply both “Text Mining and Image Analysis” techniques to extract information from Multiple Listing Service (MLS) real estate property photos in order to identify the quality of construction materials, along with any other unique features or upgrades that can influence property values.

As a result of this partnership, Chapman University students had the opportunity to develop data science solutions with real-world data, after which several students were selected to join CoreLogic’s summer internship program. Quite a few interns were then converted to full-time employees to continue their professional careers.

Building on Success

Building on the success of the ML Challenge initiative, we expanded our Chapman partnership into a Research Sponsorship.

The research effort focused on using machine learning to accurately identify neighborhood boundaries from geocoded appraisal data. The results have the potential to benefit CoreLogic across various data products, because correctly identifying a property’s neighborhood is an important, financially significant element in real estate.

It is well known that the real estate industry uses ZIP (postal) codes and Census tracts to demarcate land and categorize properties by value. These boundaries are static and do not adapt to shifts in the real estate market, so they fail to represent key market dynamics such as a planned residential project. Delineated neighborhoods are also important in socioeconomic and demographic analyses, where statistics are computed at the neighborhood level.

The CoreLogic/Chapman research demonstrates the feasibility of using the distance between subject properties and their comparable properties to delineate neighborhoods composed of properties with similar values and characteristics.
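To make the idea concrete, here is a minimal sketch of one way subject-to-comparable distances could be turned into neighborhood groupings. It is an illustration under our own assumptions, not the method from the published paper: it links a subject property to a comparable only when the two lie within a hypothetical distance cutoff (max_km), then treats each connected group of linked properties as a neighborhood. The function names, the cutoff value, and the sample coordinates are all made up for this example.

```python
import math
from collections import defaultdict


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def delineate(coords, comp_pairs, max_km=1.0):
    """Group properties into neighborhoods (illustrative only).

    coords:     dict mapping property id -> (lat, lon)
    comp_pairs: iterable of (subject_id, comparable_id) pairs
    max_km:     hypothetical cutoff; pairs farther apart are not linked
    """
    # Union-find structure: every property starts in its own group.
    parent = {pid: pid for pid in coords}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the groups of a subject and its comparable when they are close.
    for subject, comp in comp_pairs:
        if haversine_km(coords[subject], coords[comp]) <= max_km:
            parent[find(subject)] = find(comp)

    groups = defaultdict(list)
    for pid in coords:
        groups[find(pid)].append(pid)
    return sorted(sorted(members) for members in groups.values())


# Made-up sample data: three properties close together, two several km away.
sample_coords = {
    "A": (33.7930, -117.8510),
    "B": (33.7940, -117.8500),
    "C": (33.7950, -117.8520),
    "D": (33.8500, -117.7800),
    "E": (33.8510, -117.7810),
}
sample_pairs = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

if __name__ == "__main__":
    print(delineate(sample_coords, sample_pairs, max_km=1.0))
```

In this toy data the C-to-D comparable pair is several kilometres apart, so it is not linked, and the five properties fall into two groups. A real analysis would also need to account for property values and characteristics, which this sketch leaves out.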

Research Publication

CoreLogic data scientists worked closely with the Chapman doctoral candidate researcher to provide data, subject matter expertise, technical reviews, and research validation. The research and proof of concept took one year to complete and formed the foundation for a scientific paper, jointly authored by the collaborators, for the International Journal of Geo-Information, published by Molecular Diversity Preservation International (MDPI). The article went through extensive review and was subsequently approved for publication in July 2020.

The paper is available online at: https://www.mdpi.com/2220-9964/9/7/451

Conclusion

A partnership between academia and business that offers students an opportunity to solve real-world business problems is a win for both sides. Students get a chance to dive deeply into new domains and work with data that represents what they will find outside of pristine classroom environments, while organizations have an opportunity to tap into bright minds eager to demonstrate their creativity and innovation. The CoreLogic/Chapman collaboration has increased awareness of CoreLogic on the Chapman campus and continues to attract high-performing students into our talent pipeline.

Stanley Wu
Sr. Leader, Data Science and Architect