(Register for the course in the Korppi system. Note: late registration is also possible; just contact the teacher.)
ITKS-5440: Semantic Web and Linked Data (5 ECTS)
Artificial Intelligence has two major sides that complement each other: Bottom-Up AI, driven by Machine (including Deep) Learning, and Top-Down AI, driven by formalized human knowledge. This course (ITKS-5440: Semantic Web and Linked Data) belongs to the Top-Down AI family of approaches. It includes an introduction to, and a practical tutorial on, RDF-based semantic annotation of Web resources and services for the Semantic Web, Linked Data and Ontology Engineering, and it also reviews some modern applications of these methods and techniques in Web-based intelligent applications and services.
Main Content Components
Semantic Web mission; concepts of semantic interoperability, integration and automation; concept of metadata and ontology; Semantic Web standards; RDF (Resource Description Framework); Linked Data; Ontology Engineering; OWL (Web Ontology Language); Rules for inferring knowledge; SWRL (Semantic Web Rule Language); Semantic Technology; Semantic (Web) Applications and Services; Relation to Big Data and Industry 4.0.
Course-Related Context and Motivation:
The Semantic Web originates from Semantic Computing, an emerging field of computing. It is an ongoing collaborative activity led by the World Wide Web Consortium (W3C) that promotes common data formats on the World Wide Web, specifically for machine-processable and machine-understandable data, with the aim of converting the current Web, dominated by unstructured and semi-structured documents, into a "web of data" (often referred to as Web 3.0). The Semantic Web stack builds on the W3C's Resource Description Framework (RDF). Publishing machine-understandable data on the Web is becoming mainstream. Linked Data (an activity that originated from the Semantic Web vision) has seen explosive growth over the past few years. Linked Data means publishing structured data so that it can be interlinked using standard Web technologies such as HTTP, RDF and URIs, with the aim of sharing information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried. For example, DBpedia is a collection of data extracted from Wikipedia and structured in RDF, which allows Semantic Web-based applications to automatically infer implicit or new data and to make advanced queries over the Wikipedia-derived dataset. FOAF (Friend-of-a-Friend) is another example of how the Semantic Web attempts to make use of data about people and their relationships within a social context. Organizing data according to the RDF (graph) model makes it possible to connect data from distinct heterogeneous sources and to organize and query huge volumes of data (the Big Data challenge). Ontologies help to provide interoperability among the various schemas used in the data and enable applications to automatically discover and explore new, previously unknown sources of data.
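To make the idea of connecting and querying data from different sources more concrete, here is a minimal sketch in plain Python (no RDF library). It treats an RDF graph as a set of (subject, predicate, object) triples. The FOAF namespace is the real one; the people under example.org, the two "sources" and the helper names match and names_of_friends are assumptions made up for this illustration, and a real course exercise would normally use an RDF tool or a SPARQL endpoint instead.

```python
from typing import Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

FOAF = "http://xmlns.com/foaf/0.1/"   # real FOAF namespace
EX = "http://example.org/people/"     # hypothetical namespace for this sketch

# Source A: a personal profile published by Alice.
source_a: Set[Triple] = {
    (EX + "alice", FOAF + "name", "Alice"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
}

# Source B: an independent dataset that uses the same URI for Bob.
source_b: Set[Triple] = {
    (EX + "bob", FOAF + "name", "Bob"),
    (EX + "bob", FOAF + "based_near", "http://dbpedia.org/resource/Finland"),
}


def match(graph: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def names_of_friends(graph: Set[Triple], person: str) -> List[str]:
    """Follow foaf:knows links, then look up foaf:name of each friend."""
    names = []
    for _, _, friend in match(graph, s=person, p=FOAF + "knows"):
        for _, _, name in match(graph, s=friend, p=FOAF + "name"):
            names.append(name)
    return sorted(names)


# Merging two RDF graphs is a set union; shared URIs connect the data.
merged = source_a | source_b

print(names_of_friends(source_a, EX + "alice"))  # Bob's name is not in source A
print(names_of_friends(merged, EX + "alice"))    # answered only after merging
```

The query can be answered only on the merged graph, because the link (foaf:knows) and the name (foaf:name) come from different sources that agree on Bob's URI.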
Semantic Technology, a software technology driven by Semantic Web standards, allows the meaning of information to be known and processed at the execution time of various applications, making them naturally interoperable on the Web and within various digital ecosystems and clouds. In summary, the Semantic Web is an evolving development of the World Wide Web in which the meaning (semantics) of the information and services published on the Web, and their inter-relationships, are explicitly defined, making it possible for Web-based software tools, agents, applications and systems to discover, extract and “understand” Web information resources and capabilities and to utilize them automatically. Relatedly, the Linked Data activity aims to expose, share, and connect distributed pieces of data, information, and knowledge, and to extend the Web by publishing various open datasets and by setting semantic links between data items from different data sources. The Semantic Web vision assumes that Web resources are annotated with machine-interpretable descriptions (metadata) that refer to shared conceptual vocabularies (ontologies), and it provides mechanisms for automated reasoning about them.
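As a small illustration of what automated reasoning over annotated resources can mean, the sketch below applies one RDFS-style rule in plain Python: if a resource has type C and C is a subclass of D, then the resource also has type D. The rdf:type and rdfs:subClassOf URIs are the standard ones; the class names under example.org and the function name infer_types are hypothetical, and real reasoners (for OWL or SWRL) support far richer rule sets than this single rule.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/cv#"  # hypothetical ontology namespace


def infer_types(triples: Set[Triple]) -> Set[Triple]:
    """Forward-chain the rule
         (x rdf:type C), (C rdfs:subClassOf D)  =>  (x rdf:type D)
    until no new triple can be derived (a fixpoint)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                new_triple = (s, RDF_TYPE, sup)
                if o == sub and new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred


# Explicit knowledge: a tiny class hierarchy and one annotated resource.
facts: Set[Triple] = {
    (EX + "Student", RDFS_SUBCLASS, EX + "Person"),
    (EX + "Person", RDFS_SUBCLASS, EX + "Agent"),
    (EX + "alice", RDF_TYPE, EX + "Student"),
}

closure = infer_types(facts)

# Implicit knowledge derived by the rule: alice is also a Person and an Agent.
for triple in sorted(closure - facts):
    print(triple)
```

Only the fact that alice is a Student was stated explicitly; the other two type statements follow from the ontology, which is the kind of implicit knowledge a reasoner makes available to applications.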
Relation of the course to the Master Programs of the MIT Department:
The Master Program on Web Intelligence and Service Engineering (WISE) and the new International Master Program on Cognitive Computing and Collective Intelligence (COIN) are natural places for such a course, because the mission of these programs, summarized as “Everything-as-a-Service Engineering” (including Deep Learning, Big Data analytics and Web-based Cognitive Computing capabilities as services), requires Semantic (Web) Technology to enable self-management and to handle the heterogeneity of information, technology capabilities and users. The learning outcomes of this course are intended as an input to several other courses of the WISE and COIN programs (e.g., Deep Learning for Cognitive Computing; SOA and Cloud Computing; Design of Agent-Based Systems; Collective Intelligence and Agent Technology; Interface of Things; Big Data Engineering; and others).
Among the other Master programs, the closest one is the Data Analysis (or similar) program, as the course provides a framework and presents tools for machine-processable data on the Web.
The course is also suitable for the Cyber Security (or similar) Master Program as it is known that the so-called "Web of Trust" is one of the ultimate goals of the Semantic Web. Research on the topic of trust in this domain has focused largely on digital signatures, certificates, and authentication as well as trust in social networks.
The Software Engineering Master Program can benefit from semantic technology, semantic programming, semantic applications, self-managed systems engineering, and the open world assumption in software design, all of which originate from the Semantic Web vision and are based on the appropriate standards.
The best evidence that the ITKS-5440 course is naturally relevant to most master programs of the MIT department (e.g., Data Analysis, Cyber Security, variations of Computational Science, and others) is given by Amit Sheth (h-index > 80) as follows: “Semantic computing is a vision of computing based on semantics shared between machines and people. It supports and exploits intrinsic, intended, and emergent meanings (content) in all aspects of computing, encompassing programming, algorithms, information management, and human interactions within devices, as part of communications, and across the Web. Semantics involves the use of formal descriptions, languages, and models, often encoded in metadata, knowledge, and representation of agreements (as in ontologies) to capture the content of multimedia, texts, services, and structured data so that it may be extracted, shared, synthesized and transformed. Semantic techniques foster the development emerging forms of computing, such as semantic Web, and entirely new forms, such as bio-inspired computing, as well as enhance traditional techniques of information retrieval, management of data (including multimedia and multimodal) and artificial intelligence (e.g., natural language processing machine learning, and computational intelligence), leading to more efficient and scalable information processing and higher-quality computer-human interaction.”
· create your CV as part of your personal Web page, preferably in the personal Web space provided with your university account, for example:
· create an ontology in OWL (using the Protégé ontology editing tool) suitable for describing the humans, entities, organizations, events, records, abstractions, etc., mentioned in your CV (or in CVs like yours);
· semantically describe (annotate) yourself as a Web resource (with a unique URI), following the story presented in your CV (together with the other resources mentioned in it: people, universities, schools, companies, places, skills, files, documents, records, etc.), using RDF (link yourself with other relevant Web resources or physical-world resources according to the ontology created in Protégé). In Protégé, semantic annotation simply means creating a new instance of the appropriate class and filling in the form prepared by the ontology with data (putting values into slots);
· please do not provide any private or sensitive information that is not meant to be shared on the Web;
· it will be appreciated if some (the more, the better) of these “other resources” in the neighbourhood of the target person are found in, and connected with, other well-known open metadata repositories such as DBpedia, FOAF, etc.;
· it is also expected that the group of “other resources” will include various types of media files (relevant texts, photos, videos, etc.) available on the Web;
· to do the task above, please download and install Protégé version 3.5 from http://protege.stanford.edu (including the Java VM);
· when creating a new project in Protégé, select OWL/RDF files (this way Protégé will combine both the ontology and the RDF semantic annotations in the same OWL file);
· please be very careful when specifying your ontology URI (!). It should correspond exactly to the Web URL of the ontology (which depends on your personal Web space). For example:
· when you save your project for the first time, make sure that the file name corresponds to the one in the URI (e.g., cv-ontology.owl for the example above);
· notice that, in addition to the OWL file (e.g., cv-ontology.owl), Protégé will also create the Protégé-specific files PPRJ (cv-ontology.pprj) and REPOSITORY (cv-ontology.repository);
· when you have finished working with Protégé, do not forget to upload all three files (cv-ontology.owl, cv-ontology.pprj and cv-ontology.repository) to your personal Web space so that their URLs will look like:
· provide a report (e.g., in a DOC file named with the student’s capitalized family name) that contains at least: (A) the full name of the student; (B) the name of the course; (C) the URL of the original CV (e.g., ); (D) the 3 links (e.g., ) to your Protégé files; and (E) the Conclusion;
· In the “Conclusion” part of the report, please give your opinion on how (by what kinds of applications) the semantic annotation of yourself, based on the CV and the corresponding ontology, can be used;
· Do not remove the relevant files from the Web until the final decision has been made;
· The report file should be sent by e-mail to Vagan Terziyan by 15 November;
· Notification of evaluation: by 25 November.
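As a rough illustration of what the semantic annotation step in the task above produces, the following Python sketch prints a few RDF statements about a person in Turtle syntax. Everything specific here is an assumption made for the example: the ontology URI under example.org, the class cv:Student, the property cv:studiesAt, the resource cv:SomeUniversity and the person "Jane Doe" are hypothetical, and the helper names literal, iri and describe are made up. Protégé generates its own OWL/RDF serialization, so this is only meant to show the shape of the data, not the files Protégé writes.

```python
from typing import List, Tuple

PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",          # real FOAF namespace
    "cv": "http://example.org/cv-ontology.owl#",   # hypothetical ontology URI
}


def literal(text: str) -> str:
    """Render a plain string literal, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def iri(url: str) -> str:
    """Render a full URI as a Turtle IRI reference."""
    return "<" + url + ">"


def describe(subject: str, statements: List[Tuple[str, str]]) -> str:
    """Build a Turtle document: prefix declarations, then one subject
    with its predicate-object pairs separated by ';'."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in statements)
    lines.append(f"{subject} {body} .")
    return "\n".join(lines)


doc = describe("cv:JaneDoe", [
    ("a", "cv:Student"),                            # instance of an ontology class
    ("foaf:name", literal("Jane Doe")),             # reuse of the FOAF vocabulary
    ("cv:studiesAt", "cv:SomeUniversity"),          # link to another described resource
    ("foaf:based_near", iri("http://dbpedia.org/resource/Finland")),  # link to DBpedia
])

print(doc)
```

The last statement shows the kind of connection to an external open dataset that the task encourages: the person is linked to a resource identified by a DBpedia URI rather than to a locally invented one.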