What's the best RDF Schema vocabulary or OWL ontology for expressing BibTeX bibliographies in RDF?

Ideally, the vocabulary would be actively maintained, have a decent spec, be supported by converters to and from BibTeX format, and in use at several sites that publish bibliographical data.

Extra points if it's easy to use with RDFa, for embedding into an HTML bibliography.

Edit: Jakob below asked about my use case, so here it is. My own publications currently live in at least four places: my BibTeX file, my homepage, my institute's homepage, and my university's library website. So a new publication is entered four times, by me and the institute webmaster and the university librarian. Once should be enough. I'm tempted to make the BibTeX file the primary representation, because there are good tools for managing that file, and find ways of deriving the other representations from it.

Edit: Candidates mentioned so far are:

  • BIBO — much more general than BibTeX, lots of useful properties
  • CiTO — lots of useful classes for academic documents
  • SWRC (see Paper with documentation) — outdated and unmaintained
  • bibTeX in OWL — outdated and unmaintained, but closest to BibTeX

Not RDF, but useful and with working implementations:

Potentially useful code:

  • JavaBib is a BibTeX parser in Java
  • Bibster is a desktop tool that imports BibTeX and exports RDF. It's from 2004. Source code (Java; I poked around and had trouble figuring out where the BibTeX parser is)

asked 08 Jun '10, 19:35

cygri

edited 10 Jun '10, 17:48



I do not understand your question. BibTeX is a specific data format, whereas RDF is a general data structuring language. If you only want a one-to-one mapping, then BibTeX in RDF is trivial:

@book{nelson1987,
  title = {Literary machines},
  publisher = {Mindful Press},
  author = {Theodor Holm Nelson},
  year = {1987}
}

Becomes

@prefix bibtex: <thisistheonlythinkyouarelookingfor> .

_:nelson1987 a bibtex:book ;
  bibtex:id "nelson1987" ;
  bibtex:title "Literary machines" ;
  bibtex:publisher "Mindful Press" ;
  bibtex:author "Theodor Holm Nelson" ;
  bibtex:year "1987" .

If you really want to convert BibTeX to RDF then the only thing you need is a namespace prefix!
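A minimal sketch of that one-to-one mapping in Python, for illustration only. The namespace URI is a made-up placeholder, the function names are hypothetical, and the parser only handles simple entries whose values are brace-delimited without nested braces and whose citation key is usable as a blank node label:

```python
import re

# Hypothetical namespace; no official BibTeX namespace is implied here.
BIBTEX_NS = "http://example.org/bibtex#"


def parse_entry(text):
    """Parse one simple BibTeX entry: @type{key, field = {value}, ...}."""
    head = re.match(r"\s*@(\w+)\s*\{\s*([^,\s]+)\s*,", text)
    if head is None:
        raise ValueError("not a BibTeX entry")
    # Only brace-delimited values without nested braces are recognised.
    fields = dict(
        (name.lower(), value.strip())
        for name, value in re.findall(r"(\w+)\s*=\s*\{([^{}]*)\}", text)
    )
    return head.group(1).lower(), head.group(2), fields


def escape(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(text):
    """Emit every BibTeX field as a plain literal on one blank node."""
    entry_type, key, fields = parse_entry(text)
    lines = [
        "@prefix bibtex: <%s> ." % BIBTEX_NS,
        "",
        "_:%s a bibtex:%s ;" % (key, entry_type),
        '  bibtex:id "%s"' % escape(key),
    ]
    for name, value in fields.items():
        lines[-1] += " ;"
        lines.append('  bibtex:%s "%s"' % (name, escape(value)))
    lines[-1] += " ."
    return "\n".join(lines)
```

Calling `to_turtle` on the `nelson1987` entry above yields Turtle of the same shape as the hand-written example, which shows how little the flat mapping actually does.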

But obviously you are not looking for a simple encoding of BibTeX in RDF, but for a general RDF ontology for bibliographic data that happens to be able to express most of what is possible in BibTeX. In that case you need to make your use case clear.

There are many ontologies for bibliographic data, and there surely will be more, because each has its own focus. What do you want to achieve with the data that you would like to transform from BibTeX to some RDF ontology? If you want a "general bibliographic data format", you will end up with something like Dublin Core, which will not be enough for many use cases. But if you have a more specific use case (for instance, creating nice citations or doing bibliometric analysis), then I could name some more suitable specific formats (most of them have already been mentioned).

edit: I see no point in a one-to-one-mapping of BibTeX to BibTeX in RDF. The only useful thing you could do with such a BibTeX ontology is to create BibTeX again; to make real use of the data you need to map it to another data model anyway, for instance to split the list of author names into single authors. I think SPARQL and similar RDF tools are not suitable for this kind of mapping, and mapping tools from BibTeX to other formats already exist.
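To make the author example concrete, here is a rough Python sketch of the splitting step. It assumes the plain BibTeX convention of separating names with the word "and", and it deliberately ignores harder cases such as "and" inside braces or "von"/"Jr" name parts. The function names are hypothetical:

```python
import re


def split_authors(author_field):
    """Split a BibTeX author field on the ' and ' separator."""
    parts = re.split(r"\s+and\s+", author_field.strip())
    return [part for part in parts if part]


def normalize_name(name):
    """Turn 'Last, First' into 'First Last'; leave other forms untouched."""
    if "," in name:
        last, first = [piece.strip() for piece in name.split(",", 1)]
        return ("%s %s" % (first, last)).strip()
    return name.strip()
```

For a made-up field such as `Doe, Jane and John Smith`, `split_authors` gives two entries, and `normalize_name` rewrites the first to `Jane Doe`. This is string processing on the literal value, which is the part that is awkward to express in a pure RDF toolchain.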

I think you are not looking for an RDF-based bibliographic format but for a tool to manage your bibliography. There should be a master file from which other formats for different use cases can be derived. Have a look at Zotero (open source client and public sync server), Mendeley (more functionality), BibSonomy (based on BibTeX, with good connections to the computer science community, but online only), or other reference management software. You can easily manage your bibliography there and import and export the data via APIs in various formats, for instance BibTeX but also RDF-based formats. At least Zotero has an RDF-based export format; I don't know about the others.

All these programs are maintained and developed, so more and better export options are being added. One format that we will soon see in Zotero (you can already see parts of it in the source code) is the input format of the Citation Style Language (CSL). I just gave a lightning talk about CSL, see http://www.slideshare.net/nichtich/voss-elag-csl2010. I am also going to write an RDF serialization of this format (yet another bibliographic ontology). The existing CSL processor citeproc-js is pretty cool.

In short: either you do not want to deal with the details of mapping bibliographic formats, in which case just use reference management software; or you have special needs and like to dig into mapping bibliographic formats, in which case you definitely need a mapping from and to one of the formats commonly used by reference management software (BibTeX, COinS, MODS, BIBO...). In the latter case you could also invest your work directly in extending existing reference management software with additional import/export and other tools.

answered 10 Jun '10, 14:20

Jakob

edited 10 Jun '10, 21:38

Thanks Jakob, good answer. I added a note about my motivation to the question. And as you know, it's part of the RDF ethos that you should re-use other people's terms whenever practical, and that terms should be well-defined and well-documented. So, indeed all I need is a namespace prefix; but I'm hoping that some kind soul out there has already defined this namespace, and set up the proper documentation for it, and maybe that namespace is even already used by some tools or data providers.

(10 Jun '10, 16:29) cygri

The question was about rdf vocabularies for bibliographic information; so I'm not sure it's really fair to say that their question was really about applications for managing bibliographic citations...

(26 Jun '10, 16:51) Ed Summers

"I see no point in a one-to-one-mapping of BibTeX to BibTeX in RDF." One possible value of such a mapping would be the ability to get the more RDF-ish representation as a product of RDF transformation rules. E.g., superficial bibtex to RDF translation, then a more sophisticated "superficial RDF" to "good RDF" translation using SPARQL, SWRL, etc. There is value in a superficial translation if it gets data into a format that's easier to work with using a "standard" toolchain.

(27 Feb, 11:29) Joshua Taylor

Like a couple of people have said, it depends on what you need in the end. BIBO is by design a richer superset of BibTeX, and more idiomatic RDF. It reuses as much as possible of DC and FOAF, for example; as well as a bit of SIOC and MO. While it's certainly possible to encode it as RDFa (indeed, in the bib processor I'm working on, RDFa is the primary internal format), it's not going to be as "easy" as a simple flat format where everything is a literal property.

answered 10 Jun '10, 14:55

Bruce D'Arcus

Thanks Bruce! Tool support trumps good modelling for me, so if I find something that can consume BIBO, or convert BibTeX to BIBO, then I don't really care if the RDFa markup is a bit more complex.

(10 Jun '10, 16:49) cygri

BIBO and/or CiTO might be good choices, although I don't think there are any existing converters to/from BibTeX.

answered 09 Jun '10, 01:36

Dan C

Thanks Dan! BIBO definitely appears to be a crowd favourite.

(09 Jun '10, 07:07) cygri

I like CiTO very much, and it is currently aligned with SWAN.

(09 Jun '10, 09:20) Egon Willighagen

We currently use SWRC on the SW Dog Food site. It does the job, seemed the best choice at the time, and is modelled very closely on BibTeX. We're not totally happy with it anymore, though, since it

  • lacks active community support,
  • lacks good online documentation,
  • lacks good hosting: the SWRC namespace http://swrc.ontoware.org/ontology is used in different versions of the ontology, but dereferences to an old version 0.3 instead of the newer 0.7.1,
  • does not cover some important modelling aspects such as ordered lists of authors and editors.

For these reasons, we are currently thinking about moving to BIBO, which addresses all of the above.
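On the ordered-authors point: one common way to keep order in RDF is an RDF collection, which Turtle writes as a parenthesised list. The Python sketch below only shows the shape of such a statement. The property name `ex:authorList` and the function name are hypothetical placeholders, and authors are written as plain literals for brevity, although in practice they would more likely be resources:

```python
def author_list_turtle(subject, authors, prop="ex:authorList"):
    """Render authors as a Turtle collection, which preserves their order."""
    items = " ".join(
        '"%s"' % name.replace("\\", "\\\\").replace('"', '\\"')
        for name in authors
    )
    return "%s %s ( %s ) ." % (subject, prop, items)
```

Unlike repeating a flat author property once per person, the collection keeps first author, second author, and so on distinguishable after the data has been loaded into a triple store.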

Regarding converters, I used bibtex2rdf a while back. Unfortunately, the source code is not publicly available, but I did get it by asking the author.

answered 09 Jun '10, 14:58

Knud

Thanks Knud. bibtex2rdf uses JavaBib in turn, which looks like a good BibTeX parser. http://www-plan.cs.colorado.edu/henkel/stuff/javabib/

(09 Jun '10, 17:51) cygri

Don't forget about Dublin Core. dcterms:title and dcterms:description are accepted by many consumers.

The only realistic way to get a reliable round trip is to use a vocabulary that explicitly models the BibTeX structure, such as http://zeitkunst.org/bibtex/0.1/

Don't forget you can always overload your RDF. You could provide ALL the data in the BibTeX predicates, then add Dublin Core and BIBO where it's easy.
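A small Python sketch of what overloading could look like, with triples as plain tuples. The BibTeX namespace below is a made-up placeholder (it is not the zeitkunst.org one), and the field-to-Dublin-Core table is an illustrative assumption covering only two fields whose values are unproblematic as literals:

```python
DCTERMS = "http://purl.org/dc/terms/"
BIBTEX = "http://example.org/bibtex#"  # hypothetical namespace

# Assumed, partial mapping: BibTeX field name -> DCMI Terms property name.
DC_EQUIVALENTS = {"title": "title", "year": "date"}


def overload(subject, fields):
    """Emit every field as a BibTeX triple, plus a DC triple where mapped."""
    triples = []
    for name, value in fields.items():
        triples.append((subject, BIBTEX + name, value))
        if name in DC_EQUIVALENTS:
            triples.append((subject, DCTERMS + DC_EQUIVALENTS[name], value))
    return triples
```

The BibTeX triples carry everything needed for a round trip, while generic consumers that only understand Dublin Core still find a title and a date.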

answered 09 Jun '10, 16:19

Christopher Gutteridge

Thanks Christopher. Yes, DC as the lowest common denominator certainly has a role to play here. Overloading would be possible, although ideally I wouldn't have to worry about this when publishing instance data (it should be taken care of by mappings in the BibTeX-specific vocabulary).

I hadn't seen the zeitkunst.org ontology before, but looking at it, it's the typical kludged-together-in-2004-using-Protégé stuff that I'd rather not use in the age of RDFa.

(09 Jun '10, 17:26) cygri

Dear folks,

I have been alerted to this conversation very late in the day, so my apologies for reviving an old thread. However, since it mentions CiTO, let me comment.

These mentions of CiTO in earlier posts relate to my original version of CiTO (v1.6, described in J. Biomedical Semantics 1 (Suppl. 1): S6. http://dx.doi.org/10.1186/2041-1480-1-S1-S6), which contained properties for typing and counting citations, and classes for describing the objects of citations, e.g. "book" and "journal article".

In the second half of last year, Silvio Peroni and I cleaned up this mixed bag by splitting CiTO v1.6 into three separate and complementary ontologies:

1 CiTO, the Citation Typing Ontology (http://purl.org/spar/cito/), containing only the original object properties used for citation typing, as used by Egon Willighagen in CiteULike (http://chem-bla-ics.blogspot.com/2010/10/citeulike-cito-use-case-1-wordles.html) and Martin Fenner in WordPress blog posts (http://blogs.plos.org/mfenner/2011/02/14/how-to-use-citation-typing-ontology-cito-in-your-blog-posts/).

2 FaBiO, the FRBR-aligned Bibliographic Ontology (http://purl.org/spar/fabio/), that can be used for describing all things that are the objects of citations, from computer software via books and journal articles to blog posts. FaBiO has many elements in common with BIBO, but is richer both in extent and in being structured according to FRBR (http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records) into works, expressions, manifestations and items. BIBO is widely used and suitable for many purposes, but FaBiO's increased expressivity may be useful for describing things not covered by BIBO, and to avoid potential semantic confusion (e.g. between a research paper and its expression either in a conference presentation or in a research article).

3 C4O, the Citation Counting and Context Characterization Ontology (http://purl.org/spar/c4o/), that does what it says on the tin.

In addition, we created five other complementary and orthogonal ontologies for the bibliographic domain, covering for example the relationship between references and bibliographies, and the semantic and structural components of bibliographic documents. Together with CiTO, FaBiO and C4O, these form the SPAR (Semantic Publishing and Referencing) Ontologies described at http://purl.org/spar/.

Our current work involves using these ontologies as appropriate, in conjunction with Dublin Core, SWAN, PRISM, FOAF, etc., to fully describe bibliographic and citation information in RDF.

We are also extending coverage to enable descriptions of and references to datasets, having mapped the DataCite Metadata Kernel to RDF (http://bit.ly/eOLN72), and with other colleagues we are developing best practice recommendations for citing and referencing published datasets.

To come back to the original topic of this post, the SPAR ontologies are suitable for mapping reference management metadata into RDF.

If you have questions about SPAR, you can contact me at david.shotton@zoo.ox.ac.uk and Silvio at speroni@cs.unibo.it.

answered 25 May '11, 07:48

davidshotton

i have to point to ShaRef because that's the project i've been working on quite a while ago. it's XSD and pretty much derived from BibTeX, but better structured. however, it's not maintained, and the converters live in obscure java apps that probably are only used by one person on the planet these days (me), but there also is an online service. but still, it exists and is open source and easy to use.

project page (historical): http://dret.net/projects/sharef/

online converter: http://dret.net/bibconvert/

XSD: http://dret.net/bibconvert/xslt/sharef.xsd

example XML: http://dret.net/biblio/dret.xml

if there is a well-established RDFS or OWL, i might even give it a try to write some XSLT to convert my XML to RDF.

answered 08 Jun '10, 19:53

dret

Nice. The converter looks serious. Is the source available by chance?

(08 Jun '10, 19:58) cygri

DBLP, a well-known bibliography index for computer science, now exposes references either in BibTeX or in RDF.

For instance: http://dblp.l3s.de/d2r/page/publications/journals/ai/HendlerB10

The ontology used can be found at http://ontoware.org/swrc/, and is described in this paper: http://www.aifb.kit.edu/web/Inproceedings1003

But there is no converter either...

answered 09 Jun '10, 06:31

YMombrun

Nitpick: This RDF is not published by DBLP, but by a third party (Universität Hannover).

(09 Jun '10, 07:05) cygri

But thanks anyway, YMombrun! SWRC is definitely a contender.

(09 Jun '10, 07:06) cygri

Have you checked BibJSON and its export capabilities? Mainly they're using BIBO with some extensions. For examples, check with Jim Pitman (and if you do, bug him to just post them rather than leaving dummy links in).

answered 09 Jun '10, 15:16

Jodi Schneider

For completeness, it is worth pointing out that even without the planned BIBO support, Zotero imports a wide range of formats including BibTeX, and stores and exports its data as RDF. It fails your criteria in that "Zotero RDF" doesn't seem to have a standalone spec and isn't directly used for publication but is normally internal to the tool. They seem to use mostly DC with a bit of FOAF, PRISM and Biblio as the vocabularies. It's a great reference manager though.

answered 10 Jun '10, 17:45

Dave Reynolds
