This project is an effort to bring ArcGIS Online onto the Semantic Web. Metadata from ArcGIS Online has been converted into Resource Description Framework (RDF) following the principles of Linked Data. The converted ArcGIS Online Linked Data contains information about ArcGIS Web maps, Web applications, the layers contained in each map, map services, and feature services, as well as the ArcGIS users who created those maps. We expose this sample data through a SPARQL endpoint, and all the functions in this demo interact directly with this SPARQL endpoint. A sample SPARQL query can be found here.
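As a rough illustration of how a client could talk to such an endpoint, the sketch below builds a SPARQL 1.1 Protocol GET request. The endpoint URL and the `ex:` vocabulary are placeholders invented for this example; the project's actual URL and ontology terms are not given in this text.

```python
from urllib.parse import urlencode

# Hypothetical endpoint URL and vocabulary (assumptions, not the project's real ones).
ENDPOINT = "http://example.org/parliament/sparql"

QUERY = """
PREFIX ex: <http://example.org/agol#>
SELECT ?map ?title ?owner
WHERE {
  ?map a ex:WebMap ;
       ex:title ?title ;
       ex:owner ?owner .
}
LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL that passes the query in the standard 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
```

The resulting URL can be fetched with any HTTP client. The request is only constructed here, not sent, because the endpoint address is a placeholder.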
Specifically, we designed four modules to demonstrate the capabilities of the semantically annotated data.
Semantic Search: This module retrieves results based on semantic reasoning. We expand the user's query with terms that have similar meanings and with related geospatial locations. Thus, a search such as "natural disasters in utah" will return not only maps that contain the keywords "natural", "disaster", and "utah", but also maps about floods, earthquakes, and tornadoes, even if they do not contain the keyword "natural disaster". Similarly, a search for "new york water" will return not only maps that contain the keyword "water" but also maps about lakes and rivers in New York. As for geospatial expansion, a search for "California population density" will also return maps about "Los Angeles population density", since the system understands that Los Angeles is a city in California. Search is currently slow (a search may take 30 s) because the query is expanded dynamically through the external services DBpedia Spotlight and OpenCalais. The speed could be increased significantly by integrating these external services into the system and caching the results for frequently searched terms.
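The expansion and caching idea can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the two lookup tables are hypothetical stand-ins for what DBpedia Spotlight and OpenCalais would return, and `functools.lru_cache` stands in for whatever cache the system might use.

```python
from functools import lru_cache

# Hypothetical lookup tables standing in for the external services.
RELATED_TERMS = {
    "natural disaster": ["flood", "earthquake", "tornado"],
    "water": ["lake", "river"],
}
CONTAINED_PLACES = {
    "california": ["los angeles"],
}


@lru_cache(maxsize=1024)
def expand_query(query: str) -> tuple:
    """Return the query plus its semantic and geospatial variants."""
    q = query.lower()
    variants = [q]
    # Semantic expansion: swap a known term for each related term.
    for term, related in RELATED_TERMS.items():
        if term in q:
            variants += [q.replace(term, r) for r in related]
    # Geospatial expansion: swap a region for the places it contains.
    for place, children in CONTAINED_PLACES.items():
        if place in q:
            variants += [v.replace(place, c) for v in list(variants) for c in children]
    return tuple(variants)
```

Calling `expand_query("California population density")` a second time is answered from the cache, which is the effect the text suggests for frequently searched terms.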
Knowledge Discovery: This module shows the new knowledge that can be discovered using Linked Data. For example, we can find the users who are using a particular basemap. This knowledge is useful when such a basemap needs to be updated: an automatic email system can send messages to remind those users that their web maps may be temporarily unavailable. As another example, we can find the popular basemaps by counting the number of times each basemap has been used. The result differs from another definition of popularity, which is based on views (i.e., how many times the basemap has been viewed by people). We provide both queries, and interested users can run them and compare the results. With the group summary queries, we can quickly find the number of web maps created by each group, and we can summarize by space and time. Finally, we also provide queries for user, group, and map interactions. One can, for example, first search for a Web map, then check the group that contains this Web map, then find the owner of the group, and then find other Web maps in this group.
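The difference between the two popularity definitions can be shown with a small in-memory sketch. The records below are invented sample data, and summing the view counts of the web maps that use a basemap is an assumption made here for illustration; the demo itself answers these questions with SPARQL queries against the endpoint.

```python
from collections import Counter

# Hypothetical sample records: (web map, basemap, owner, view count).
WEB_MAPS = [
    ("map1", "Topographic", "alice", 500),
    ("map2", "Topographic", "bob", 20),
    ("map3", "Imagery", "carol", 900),
    ("map4", "Streets", "alice", 10),
]


def users_of_basemap(records, basemap):
    """Owners whose web maps use the given basemap."""
    return sorted({owner for _, bm, owner, _ in records if bm == basemap})


def popularity_by_usage(records):
    """Number of web maps that use each basemap."""
    return Counter(bm for _, bm, _, _ in records)


def popularity_by_views(records):
    """Total views of the web maps that use each basemap."""
    views = Counter()
    for _, bm, _, n in records:
        views[bm] += n
    return views
```

With this sample data the two rankings disagree: the basemap used by the most web maps is not the one with the most views.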
GeoSPARQL: This module demonstrates the capability of OGC's GeoSPARQL to support geospatial queries over Linked Data. Users can select a search type (e.g., Web Map), type in a thematic term (e.g., fire), specify a geospatial area (e.g., California), and then click search to find the results. By clicking on the image of a result, users can also see a detailed information table listing all of its attributes. This table also shows all the links from the target object to other objects. By following these links and using the "Back" and "Forward" buttons, users can find more interesting results. At the bottom of the search page, there is also a map that displays the geospatial extents of all the search results.
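A query of the kind this module issues might be assembled as sketched below. The `geo:` and `geof:` namespaces and the `geof:sfIntersects` function come from the GeoSPARQL standard; the `ex:` vocabulary, the function names, and the approximate California bounding box are assumptions made for this example.

```python
def bbox_to_wkt(west, south, east, north):
    """Return a closed WKT polygon (longitude latitude order) for a bounding box."""
    ring = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    return "POLYGON((" + ", ".join(f"{x} {y}" for x, y in ring) + "))"


def build_geosparql_query(type_iri, keyword, bbox):
    """Build a query for items of one type matching a keyword inside a bounding box."""
    wkt = bbox_to_wkt(*bbox)
    # Escape characters that would break out of the SPARQL string literal.
    kw = keyword.lower().replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX ex: <http://example.org/agol#>
SELECT ?item ?title WHERE {{
  ?item a <{type_iri}> ;
        ex:title ?title ;
        geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER(CONTAINS(LCASE(STR(?title)), "{kw}"))
  FILTER(geof:sfIntersects(?wkt, "{wkt}"^^geo:wktLiteral))
}}"""


# Approximate bounding box for California (west, south, east, north), illustrative only.
CALIFORNIA = (-124.4, 32.5, -114.1, 42.0)
query = build_geosparql_query("http://example.org/agol#WebMap", "fire", CALIFORNIA)
```

The thematic term becomes a string filter on the title and the selected area becomes a spatial filter, mirroring the three inputs of the search form.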
Statistics: This module provides a summary of all the exported and converted ArcGIS Online data. Users can check the number of items of each type (e.g., Web map, map service, and feature service). Users can also summarize the data by timestamp (i.e., the time span covered by each type of data).
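The two summaries described here, a count per type and a time span per type, amount to a simple grouping. The sketch below uses invented sample items and dates purely to show the shape of the result.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample items: (item type, creation date).
ITEMS = [
    ("Web Map", date(2012, 3, 1)),
    ("Web Map", date(2014, 7, 15)),
    ("Map Service", date(2011, 1, 20)),
    ("Feature Service", date(2013, 5, 5)),
    ("Feature Service", date(2013, 9, 30)),
]


def summarize(items):
    """Return the count and the earliest and latest dates for each item type."""
    by_type = defaultdict(list)
    for item_type, created in items:
        by_type[item_type].append(created)
    return {
        t: {"count": len(ds), "earliest": min(ds), "latest": max(ds)}
        for t, ds in by_type.items()
    }


summary = summarize(ITEMS)
```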
This is still a prototype, and your comments are very welcome. More details can be found in the slides "ArcGIS Online as Linked Data.pptx".
SPARQL endpoint: Parliament, which integrates Jetty (servlet container), Jena and ARQ (query processor), Joseki (implementation of the SPARQL protocol), Berkeley DB (resource dictionary).
RDF Converter: Java as programming language and Apache Jena API for creating RDF statements.
Ontology design: CMap and Protege.
Data mining: ArcGIS Online REST API.
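To give a feel for the conversion step listed above, the following sketch turns one item record into N-Triples statements. The actual converter is written in Java with the Apache Jena API; this Python version is only an illustration, and the base IRI, the vocabulary IRI, and the choice of properties are assumptions. The input keys (`id`, `type`, `title`, `owner`) follow the fields of an ArcGIS Online item as returned by the REST API.

```python
# Hypothetical namespaces (assumptions, not the project's real ones).
BASE = "http://example.org/agol/"
VOCAB = "http://example.org/agol#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value: str) -> str:
    """Serialize a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def item_to_ntriples(item: dict) -> list:
    """Convert one item record into a list of N-Triples lines."""
    subject = f"<{BASE}item/{item['id']}>"
    # Item types such as "Web Map" contain spaces, which are not allowed in IRIs.
    type_name = item["type"].replace(" ", "")
    return [
        f"{subject} <{RDF_TYPE}> <{VOCAB}{type_name}> .",
        f"{subject} <{VOCAB}title> {literal(item['title'])} .",
        f"{subject} <{VOCAB}owner> <{BASE}user/{item['owner']}> .",
    ]


triples = item_to_ntriples(
    {"id": "abc123", "type": "Web Map", "title": 'Fire "risk"', "owner": "jdoe"}
)
```

Linking the owner as an IRI rather than a literal is what allows queries to move from a map to its user and on to that user's other maps.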
Developers: Yingjie Hu and Sathya Prasad
Applications Prototype Lab | Esri