New Technology Outlook for Digital Libraries
Lin Xia (林夏)
August 2006

New Technology or New Ideas?
§ From 1.0 to 2.0
  – The technology has been updated,
  – but the change in ideas is greater.
§ From DSpace to ...
  – We have only just started using DSpace,
  – but the next generation of products after DSpace has already appeared.
  – What does this tell us?

Seeking a Paradigm Shift ...
§ On digital collections
  – From documents to documents with data
  – From repositories to learning communities
  – From digital libraries to cyber-infrastructure
§ On digital organizations
  – From standard-driven metadata to community-based, user-driven social classification
  – From a structured, regulated information space to a dynamic, self-organizing space
  – From tagging to semantic digital libraries

Trend 1: From document collections to document "data collections"
§ Text
  – HighWire Press
  – BioMed Central
  – eScholarship Repository
§ Non-Text
  – NASA Image Exchange
  – Open Video
§ Data
  – Science – CODATA (Committee on Data for Science and Technology)
  – Social Science – ICPSR (The Inter-university Consortium for Political and Social Research)

Trend 2: From repositories to learning communities
§ Learning Object Repositories
§ Learning Object Standards
  – SCORE – Sharable Content Online Resources for Education
  – SCORM – Sharable Content Object Reference Model
§ Learning Communities
  – IMS Global Learning Consortium
  – Learning System – CORDRA

Trend 3: From Digital Libraries to Cyber-infrastructure
§ Creation of a technological foundation for significant discovery, synthesis, and dissemination brought about by the Information Revolution.
§ Establishment of a culture of modern research in all areas: sciences, engineering, social sciences, and humanities.
§ Integration of technology, content, and practice.

The Next Generation of Digital Libraries
§ Must be built on new ideas
  – Understand the nature and characteristics of digital information
    § Digital libraries are established on the networked information environment.
  – Study emergent behaviors of links or relationships among information entities
    § Organize information not only by concepts but also by links or connections.
  – Put people and community first
    § Users are digital library contributors.
    § Users collaborate through digital libraries.

A New Discipline
§ Digital Information Organization (DIO)
  – Built on the foundation of three disciplines
    § The Science of Networks
      – How everything is connected to everything else and we can still make sense of it.
      – What are the characteristics of the information space?
    § Library and Information Science
      – How information is defined and represented
      – How information-seeking activities can be defined in a cognitive framework
    § Cognitive Science
      – What is known about the mind and how it processes information
      – What do people do with information?

Selected Readings
§ Network Science
  – Barabasi, A. (2003). Linked: The New Science of Networks.
  – Watts, D. (2003). Six Degrees: The Science of a Connected Age.
§ LIS
  – Ingwersen, P., & Järvelin, K. (2005). The Turn: Integration of Information Seeking and Retrieval in Context.
  – Svenonius, E. (2000). The Intellectual Foundation of Information Organization.
§ Cognitive Science
  – Lakoff, G. (1990). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind.
  – Lamberts, K., & Shanks, D. (Eds.). (1997). Knowledge, Concepts, and Categories.

Networked information environment?

The New Science of Networks
§ The study of the complexity of connectivity.
§ The study of emergent behaviors of physical, biological, and social networks.
§ Some essential findings
  – Power law distribution
  – Small worlds
    § Six degrees of separation
    § Hubs and connectors
  – Scale-free networks
    § The 80/20 rule
    § Rich get richer
  – Hierarchies and communities
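
The "rich get richer" finding can be illustrated with a small simulation. This is a minimal sketch of preferential attachment and not a method taken from the readings. It assumes each new node adds exactly one link, and the function name, network size, and seed are chosen only for illustration.

```python
import random
from collections import Counter

def preferential_attachment(n_nodes, seed=0):
    """Grow a network one node at a time; each newcomer links to one existing
    node chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    # Start from a single edge between nodes 0 and 1.
    # `endpoints` lists every edge endpoint, so a node appears once per link;
    # drawing uniformly from it is a degree-proportional draw.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        endpoints.extend([new_node, target])
    return Counter(endpoints)  # node -> degree

degrees = preferential_attachment(2000)
ranked = sorted(degrees.values(), reverse=True)
top_share = sum(ranked[: len(ranked) // 5]) / sum(ranked)
print("largest hub degree:", ranked[0])
print("median degree:", ranked[len(ranked) // 2])
print(f"share of link endpoints held by the top 20% of nodes: {top_share:.0%}")
```

Most nodes end up with very few links while a handful become hubs, which is the skewed pattern the 80/20 rule describes.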

The Fundamental Question
§ The information problem today is moving from the lack of information to information overload.
§ What are the characteristics of such a shift?
§ What changes will the shift bring to the information environment?
§ How do we adapt to such a shift?
§ We need new knowledge representation methods in today's networked information environment.

My exploration: Dynamic Knowledge Representation
§ Digital Information Organization needs to be:
  – Dynamic
  – Networked
  – User-initiated
  – Community-based
  – Self-organized

Comparison of DIO with Thesaurus

Thesaurus                                                      | DIO
Concept-based                                                  | Network-based
Static structures (such as hierarchy, cross-references)        | Dynamic structures
Based on constructions of human experts                        | Based on large-scale data analysis
The indexing space is independent of the document space        | Integration of the concept space and the document space
Organization-based                                             | Community-based

Tagging
§ Tagging is very popular this year
  – It's a kind of social bookmarking
  – It's a kind of social classification
  – It's a new information organizing method
    § User-initiated
    § Connection-based
    § Community-based
§ It has a lot of features of (future) DIO
  – It's not there yet.
  – What's missing?
  – What can be improved?

Problems of Tagging
§ Users are free to use any words or terms as tags.
§ Users are free to assign different meanings to the tags they use.
§ Tags are not semantically connected.
  – No semantic structure?
§ We need to bring semantics back to the tags!
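
One possible first step toward connecting tags is to count how often two tags are assigned to the same resource. The sketch below uses made-up bookmarks and hypothetical helper names. Co-occurrence is only a statistical signal, so it suggests which tags may be related without saying what the relation means.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bookmarks: resource -> tags assigned by users (made-up data).
bookmarks = {
    "doc1": {"dspace", "repository", "metadata"},
    "doc2": {"dspace", "metadata", "rdf"},
    "doc3": {"rdf", "semantic-web", "metadata"},
    "doc4": {"tagging", "folksonomy"},
}

def tag_cooccurrence(tagged):
    """Count how many resources each unordered pair of tags shares."""
    pairs = Counter()
    for tags in tagged.values():
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs

def related_tags(pairs, tag, min_count=2):
    """Tags that co-occur with `tag` on at least `min_count` resources."""
    related = set()
    for (a, b), n in pairs.items():
        if n >= min_count and tag in (a, b):
            related.add(b if a == tag else a)
    return related

pairs = tag_cooccurrence(bookmarks)
print(related_tags(pairs, "metadata"))
```

Naming the relation between two linked tags (broader, narrower, synonym) still needs an ontology or human judgment, which is where semantic digital libraries come in.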

Semantic Digital Libraries
§ Enhance current digital libraries through
  – integration of information based on different metadata
  – interfaces
  – ontology languages
  – ontology-based search / faceted search
  – community-enabled browsing
§ Enforce the transition from a static information space to a dynamic (collaborative) knowledge space.

SIMILE
§ Semantic Interoperability of Metadata and Information in unLike Environments
  – "Leverage and extend DSpace, enhancing its support for arbitrary schemas and metadata, primarily through the application of RDF and Semantic Web technology."
  – W3C, HP, and MIT are jointly developing a series of new systems
    § Strengthen DSpace's metadata capabilities
    § Apply Semantic Web technologies to digital libraries

SIMILE
§ Tools for Metadata Managers
  – Gadget - XML inspector
  – RDFizers - Batch tools to transform existing XML data into RDF
  – Solvent - Firefox extension for JavaScript screen scraping
  – Welkin - Graphical tool to inspect/edit RDF graphs
§ Tools for End-Users
  – Longwell - Web-based RDF faceted metadata browser
  – Piggy Bank - Firefox extension for personal information management of metadata in RDF
  – Semantic Bank - Web-based server that allows data publishing and sharing by individuals, groups, or communities
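
To make "transform existing XML data into RDF" concrete, here is a minimal sketch of the idea. It is not the RDFizer code. The record layout, the `example.org` base URI, and the function names are assumptions made for illustration, and child elements are mapped to Dublin Core elements simply by tag name.

```python
import xml.etree.ElementTree as ET

# Hypothetical input record and base URI, used only for illustration.
XML_RECORD = """
<record id="item42">
  <title>Digital Library Trends</title>
  <creator>A. Author</creator>
  <date>2006</date>
</record>
"""
BASE = "http://example.org/item/"
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples string literal
    (other control characters are not handled in this sketch)."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def xml_to_ntriples(xml_text):
    """Map each child element of <record> to one triple:
    subject = record id, predicate = Dublin Core element named after the tag,
    object = the element text as a plain literal."""
    record = ET.fromstring(xml_text)
    subject = f"<{BASE}{record.get('id')}>"
    lines = []
    for child in record:
        value = escape_literal((child.text or "").strip())
        lines.append(f'{subject} <{DC}{child.tag}> "{value}" .')
    return lines

for line in xml_to_ntriples(XML_RECORD):
    print(line)
```

Each output line is one subject-predicate-object statement, so records from different schemas can be merged into one graph and browsed together.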

BRICKS
§ Building Resources for Integrated Cultural Knowledge Services
  – Advanced open source software solutions for the sharing and the exploitation of digital cultural resources.
  – Access to distributed available information resources

BRICKS
§ Content Management
  – store content internally or reference content stored anywhere else
  – implementation based on Java Content Repository (JCR) / Jackrabbit
§ Metadata Management
  – support for various metadata schemas defined in OWL-DL
  – bibliographic records in RDF; query on records in SPARQL
  – implementation based on Jena Semantic Web Framework
§ Collection Management
  – organize content items in hierarchical structures (or folders)
§ Annotation Management
  – collaborative aspect of BRICKS
  – annotate images or image parts with text or links to other items
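
Querying bibliographic records held as RDF comes down to matching triple patterns that share variables. The sketch below imitates that in plain Python. It is not SPARQL, Jena, or BRICKS code. The `ex:` and `dc:` prefixes and the function names are hypothetical, and the titles and years come from the Selected Readings slide.

```python
# Hypothetical bibliographic triples (subject, predicate, object).
TRIPLES = [
    ("ex:book1", "dc:title", "Linked"),
    ("ex:book1", "dc:date", "2003"),
    ("ex:book2", "dc:title", "Six Degrees"),
    ("ex:book2", "dc:date", "2003"),
    ("ex:book3", "dc:title", "The Turn"),
    ("ex:book3", "dc:date", "2005"),
]

def match(pattern, triples, binding=None):
    """Yield variable bindings for one triple pattern.
    Terms starting with '?' are variables; all other terms must match exactly."""
    binding = binding or {}
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield candidate

def query(patterns, triples):
    """Join several patterns on their shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, triples, b)]
    return bindings

# Roughly: SELECT ?title WHERE { ?book dc:date "2003" . ?book dc:title ?title }
results = query([("?book", "dc:date", "2003"), ("?book", "dc:title", "?title")], TRIPLES)
print(sorted(r["?title"] for r in results))
```

A real deployment would hand this work to a triple store and a SPARQL engine. The point here is only that the query needs no fixed record schema.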

JeromeDL
§ Digital library built on semantic web technologies to answer requirements from librarians, scientists, and everyone.
§ A social semantic digital library makes use of Semantic Web and Social Networking technologies to enhance both interoperability and usability.

Conclusions
§ New ideas are driving the development of new technologies
  – The scope of the digital library is still expanding.
  – Digital libraries and the use of digital libraries must develop in step.
  – Users should take part in building the content and organization of digital libraries.
  – From Tagging to Semantic Digital Libraries – the organization of digital information is heading toward a new wave of innovation.