The architecture is based upon five internal components that support the following process:
- The Service Crawler (SC) obtains the available services and related information by crawling HTML and PDF documents from the Web;
- The Automatic Annotator (AA) receives the crawled data and enriches it with annotations according to the Service-Finder ontology and the Service Category ontology;
- The Conceptual Indexer and Matcher (CIM) receives and integrates all the information into a coherent semantic model based on the ontologies and provides reasoning and query capabilities;
- The Service-Finder Interface (SFP) provides the user interface for searching and browsing the data managed by the CIM. It also enables users to contribute information in a Web 2.0 fashion by adding tags, categorizations, ratings, and comments to the data they browse;
- The Cluster Engine (CE) analyses user behaviour in interacting with the SFP (user click streams) in order to provide users with recommendations.
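The hand-off between the first components can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `ServiceRecord` type, the keyword table standing in for the Service Category ontology, and the in-memory `Index` are hypothetical and do not reflect the actual Service-Finder implementation, which relies on ontologies and a reasoning engine.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """Hypothetical record handed from one component to the next."""
    url: str
    text: str
    categories: set = field(default_factory=set)
    tags: set = field(default_factory=set)


def crawl(pages):
    """SC stand-in: wrap already-fetched (url, text) pairs as records."""
    return [ServiceRecord(url=u, text=t) for u, t in pages]


# Toy keyword table; an assumption standing in for the Service Category ontology.
CATEGORY_KEYWORDS = {"weather": "Weather", "sms": "Messaging"}


def annotate(records):
    """AA stand-in: attach categories by naive keyword matching."""
    for r in records:
        words = r.text.lower().split()
        r.categories |= {c for k, c in CATEGORY_KEYWORDS.items() if k in words}
    return records


class Index:
    """CIM stand-in: stores records and answers simple category queries."""

    def __init__(self):
        self._by_url = {}

    def add_all(self, records):
        for r in records:
            self._by_url[r.url] = r

    def add_tag(self, url, tag):
        # User contribution arriving through the interface (SFP).
        self._by_url[url].tags.add(tag)

    def by_category(self, category):
        return sorted(u for u, r in self._by_url.items() if category in r.categories)


pages = [
    ("http://example.org/a?wsdl", "Weather forecast service"),
    ("http://example.org/b?wsdl", "Send SMS messages"),
]
index = Index()
index.add_all(annotate(crawl(pages)))
index.add_tag("http://example.org/a?wsdl", "forecast")
```

The point of the sketch is only the direction of the data flow: crawled data is enriched before it is integrated, and user contributions are written to the same store that answers queries.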
Besides the five internal components, Service-Finder relies on some features provided by the seekda portal, such as:
- a list of meaningful URLs from which to start the crawling;
- ranking values for services;
- availability graphs and an online invoker to test services.
In return, the Service-Finder Interface provides seekda and the Service Crawler with a list of URLs of new services suggested by users. These URLs can then be considered in subsequent crawls.
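One way to picture this feedback loop is as a merge of seed lists before each crawl. The sketch below is an assumption for illustration only: the function name `next_crawl_seeds` and its three inputs are hypothetical, and the text does not specify how Service-Finder actually combines or deduplicates URLs.

```python
def next_crawl_seeds(portal_seeds, user_suggestions, already_crawled):
    """Combine seed URLs for the next crawl (hypothetical helper).

    Portal-provided seeds come first, followed by user-suggested URLs.
    Blank entries, duplicates, and already-crawled URLs are skipped.
    """
    seen = set(already_crawled)
    seeds = []
    for url in list(portal_seeds) + list(user_suggestions):
        normalized = url.strip()
        if normalized and normalized not in seen:
            seen.add(normalized)
            seeds.append(normalized)
    return seeds


seeds = next_crawl_seeds(
    portal_seeds=["http://example.org/a?wsdl", "http://example.org/b?wsdl"],
    user_suggestions=[" http://example.org/c?wsdl ", "http://example.org/a?wsdl", ""],
    already_crawled=["http://example.org/b?wsdl"],
)
```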
The final architecture also supports the following interactions:
- Accessing external UDDI registries to gather more service URLs for the crawler;
- Considering user feedback to improve the quality of the automatically annotated data for service categorization and document categorization;
- Suggesting recommendations to users based on their behaviour in interacting with the portal;
- Showing users statistics, such as the most frequent queries, tags, or categories, or the most popular services and providers;
- Accessing Service-Finder data as an ebXML or UDDI registry to improve Service-Finder's exploitation channels;
- Providing Service-Finder data as RDFa in order to contribute to the Web of Data with machine-readable Linked Data.
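The recommendation interaction above can be sketched as follows. Note that this is not the clustering performed by the Cluster Engine: it substitutes simple co-occurrence counting over click-stream sessions, and the session data, service names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(sessions):
    """Count how often two services are viewed in the same session."""
    co = defaultdict(Counter)
    for session in sessions:
        # set() so that repeated clicks in one session count once.
        for a, b in permutations(set(session), 2):
            co[a][b] += 1
    return co


def recommend(co, service, k=2):
    """Return up to k services most often co-viewed with `service`.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = co.get(service, Counter())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical click streams: each inner list is one user session.
sessions = [
    ["weather", "geocode"],
    ["weather", "geocode", "sms"],
    ["weather", "sms"],
    ["geocode", "maps"],
]
co = build_cooccurrence(sessions)
```

With these sessions, `recommend(co, "geocode")` ranks `weather` first because the two services appear together in two sessions. A clustering approach would instead group users or services with similar interaction patterns before deriving recommendations.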