Thesis
Toward a self-structuring software library: Nested self-organising maps for retrieving and browsing reusable software components
Southern Cross University, School of Multimedia and Information Technology
Doctor of Philosophy (PhD), Southern Cross University
2000
Abstract
One of the major challenges in software reuse is how to construct software libraries that will facilitate efficient and effective retrieval of the required software components. Reusable software components stored in a library include not only source code units, but also other assets, such as specifications, architectures, designs and test cases. Software library systems based on current approaches are either ineffective in retrieval, or inefficient in building and updating the libraries. Such systems may also be difficult to use, because they require considerable knowledge of their inner workings.
This thesis presents an approach called Visualised Software Library (VSL) for the structuring of software libraries. Software documentation is used as a surrogate for software components in VSL. For reuse purposes, the most significant information about a software component is its functionality, which is usually described in an associated document. To achieve an optimal balance among effectiveness, efficiency and user-friendliness, VSL combines the strengths of the automatic free-text indexing method and the Self-Organising Map (SOM) neural network technology. The proposed VSL approach consists of four major modules:
• representation scheme,
• classification scheme,
• retrieval mechanism, and
• browsing facility.
The representation scheme integrates the automatic free-text indexing method with a feature selection technique to characterise a software component collection. This is done by analysing the documentation associated with the collection to identify the key features in the collection. The key features are selected in two steps. The first step selects those features that characterise the functionality of the individual components. The second step further screens the features selected in the first step in terms of their significance in representing the interrelationships among the components. Based on the identified key features, a software component collection can be represented by a set of feature vectors which will be used for classification.
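The two selection steps described above can be sketched as follows. This is a minimal illustration, not the thesis's actual method: the TF-IDF weighting, the `top_k` and `min_df` thresholds, the stopword list, and the four sample Unix descriptions are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real indexer would use a much larger one.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "and", "or", "for", "from"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def select_features(docs, top_k=3, min_df=2):
    """Two-step selection (assumed criteria, for illustration only)."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    df = Counter()  # document frequency of each term
    for words in tokens.values():
        df.update(set(words))

    # Step 1: per component, keep the top_k terms by TF-IDF weight,
    # i.e. terms that characterise that component's own functionality.
    candidates = set()
    for words in tokens.values():
        tf = Counter(words)
        ranked = sorted(tf, key=lambda w: (-tf[w] * math.log((1 + n_docs) / df[w]), w))
        candidates.update(ranked[:top_k])

    # Step 2: keep only candidates shared by at least min_df components,
    # so the surviving features relate components to one another.
    return sorted(w for w in candidates if df[w] >= min_df)


def to_vectors(docs, features):
    """Represent each component as a term-count vector over the key features."""
    vectors = {}
    for name, text in docs.items():
        tf = Counter(tokenize(text))
        vectors[name] = [tf[f] for f in features]
    return vectors


# Hypothetical one-line descriptions standing in for component documentation.
docs = {
    "cp": "copy files and directories",
    "mv": "move or rename files",
    "rm": "remove files or directories",
    "sort": "sort lines of text files",
}
features = select_features(docs)
vectors = to_vectors(docs, features)
```

With these sample descriptions, only `directories` and `files` survive both steps, so each component is represented by a two-dimensional vector.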
The classification scheme is based on the Nested Self-Organising Map (NSOM), a sophisticated neural network architecture. The NSOM-based classification scheme aims to tackle the problem of poor precision in current SOM-based general document classification systems. The NSOM-based storage structure supports a more effective two-level retrieval: a coarse-grained retrieval to improve recall and a fine-grained retrieval to enhance precision.
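The two-level retrieval idea can be illustrated with a small sketch. It assumes, purely for illustration, that each node of a top-level map owns a nested sub-map and that both levels are matched by Euclidean distance; the abstract does not specify the NSOM structure at this level of detail. The weight vectors and component names are hypothetical, hand-set values rather than the output of any training run.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(nodes, query):
    """Return the node whose weight vector is closest to the query."""
    return min(nodes, key=lambda node: distance(node["weights"], query))


# Hypothetical, already-trained maps: each top-level node holds a sub-map,
# and each sub-map node holds the components assigned to it.
top_map = [
    {"weights": [1.0, 0.0], "submap": [
        {"weights": [1.0, 0.1], "components": ["cp", "mv"]},
        {"weights": [0.8, 0.0], "components": ["rm"]},
    ]},
    {"weights": [0.0, 1.0], "submap": [
        {"weights": [0.0, 1.0], "components": ["sort", "uniq"]},
        {"weights": [0.2, 0.9], "components": ["grep"]},
    ]},
]


def coarse_retrieve(top_map, query):
    """Coarse level: return every component under the best top-level node."""
    node = best_match(top_map, query)
    return [c for sub in node["submap"] for c in sub["components"]]


def fine_retrieve(top_map, query):
    """Fine level: descend into the sub-map and return one node's components."""
    node = best_match(top_map, query)
    return best_match(node["submap"], query)["components"]


query = [0.95, 0.1]  # hypothetical query vector over two key features
coarse = coarse_retrieve(top_map, query)
fine = fine_retrieve(top_map, query)
```

In this sketch the coarse result is the larger set (favouring recall) and the fine result is a subset of it (favouring precision).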
The retrieval mechanism enables users to describe their queries in natural language without needing to understand the inner workings of the library. As a result, user effort in formulating queries is reduced. Furthermore, a query refinement mechanism based on a relational thesaurus and a fuzzy-related thesaurus is provided to overcome the problem of ill-defined information needs. The user can select an appropriate feature from the thesauri to expand the query, or to replace an inappropriate query term and so formulate a more adequate query.
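A thesaurus-driven refinement step of this kind might look like the sketch below. The thesaurus contents, the relatedness degrees, the `min_degree` cut-off, and the function names are all hypothetical; the abstract does not describe how the thesauri are built or stored.

```python
def suggest(term, thesaurus, min_degree=0.5):
    """List related terms for a query term, strongest relation first."""
    related = thesaurus.get(term, {})
    ranked = sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, degree in ranked if degree >= min_degree]


def expand(query, chosen):
    """Add a user-selected term to the query if it is not already present."""
    return query if chosen in query else query + [chosen]


def replace(query, old, new):
    """Swap an inappropriate query term for a user-selected one."""
    return [new if t == old else t for t in query]


# Hypothetical fuzzy thesaurus: term -> {related term: degree in [0, 1]}.
thesaurus = {"delete": {"remove": 0.9, "erase": 0.7, "move": 0.2}}

query = ["delete", "file"]
options = suggest("delete", thesaurus)          # terms offered to the user
expanded = expand(query, options[0])            # user chooses to expand
reformulated = replace(query, "delete", options[0])  # or to replace
```

The user stays in control in both cases: the thesaurus only proposes candidates, and the query changes only when a term is selected.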
The browsing facility is provided as a complementary search mechanism. It is based on a meaningful and intelligible information search space derived from the NSOM-based storage structure. In contrast to the hierarchical search space used in current software browsing systems, where no navigation guidance is given, VSL's browsing facility can help the user predict where the desired component may be located.
To evaluate the VSL approach, a prototype system was developed and a sample library containing Unix system components was established as a test case. Results obtained from a retrieval experiment reveal that the established VSL-based library can achieve a better level of precision at the same level of recall in comparison with Guru, a software library system considered capable of achieving better-than-average retrieval performance. The retrieval performance of the VSL-based library was also compared with that of a publicly available full-text retrieval system, and a significant improvement in both recall and precision was obtained. The positive experimental results demonstrate the effectiveness of the VSL approach. A qualitative assessment of system efficiency and user-friendliness has also been made.
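For readers unfamiliar with the two measures used in the evaluation, the sketch below computes precision and recall for a single query using their standard definitions. The retrieved and relevant sets are made-up placeholders, not data from the thesis experiment.

```python
def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query outcome: four components returned, three truly relevant.
retrieved = ["cp", "mv", "rm", "ln"]
relevant = ["cp", "mv", "dd"]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved components are relevant (precision 0.5) and two of the three relevant components were found (recall 2/3).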
Details
- Title
- Toward a self-structuring software library: Nested self-organising maps for retrieving and browsing reusable software components
- Creators
- Huilin Ye
- Contributors
- Bruce Lo (Supervisor) - Southern Cross University
- Gitesh K Raikundalia (Supervisor) - Southern Cross University
- Awarding Institution
- Southern Cross University; Doctor of Philosophy (PhD)
- Theses
- Doctor of Philosophy (PhD), Southern Cross University
- Publisher
- Southern Cross University, School of Multimedia and Information Technology
- Number of pages
- xvii, 193
- Identifiers
- 991012961100102368
- Copyright
- © Huilin Ye 2000
- Academic Unit
- Faculty of Business, Law and Arts
- Resource Type
- Thesis