Welcome to Project Runeberg
Project Runeberg (runeberg.org) is a volunteer effort to create free electronic editions of classic Nordic (Scandinavian) literature and make them openly available over the Internet.

Project Runeberg, January 2023


January 2023

The world according to Poe

If you are tired of all the Henrik Ibsens and August Strindbergs of the turn of the century 1900, regularly using trains, telegraphs and even telephones, perhaps you prefer to take a step back to the first half of the 19th century and the time of Edgar Allan Poe? We have updated our presentation of this great American author and have found some of his texts in Swedish and Norwegian translation.

From this era, we have also added or improved the following titles:

Datasets

Are you a student or researcher of data mining, natural language processing (NLP) or artificial intelligence (AI)? Perhaps you want to train your own version of ChatGPT? Look no further: you can now download an entire Danish encyclopedia as a dataset.


October 2022

An encyclopedic year

During 2022 we have expanded our encyclopedic collection in several ways:

Even more works can be found in our list of recently digitized titles.

Support Ukraine 2022

In November we usually start our annual fundraiser, but this year we are taking a different approach. Ukraine's fight for freedom is more important than filling our own account (which also got full enough from last year's fundraiser). This is why we are starting already now, in October, to encourage our friends to donate in support of Ukraine. Read more about this.


November 2021

2021/22 Fundraiser

Our fundraiser for this year starts on November 10. It is our fourth, and it runs the same way as the previous ones. A small banner (the one above) is shown on some of our web pages, promoting donations toward our goal of 50,000 SEK for the fiscal year 2021/22. The banner will be removed when the goal is reached, and will reappear next year. We have long had a "Donate" link in the header of all our web pages. Read more on our donation page.


December 2020

Danish novels

Project Runeberg aims to cover the literature of the Nordic countries or Scandinavia, but to be honest, most of our content is Swedish. This year, however, we have improved in the area of Danish novels by authors such as: Herman Bang, Carit Etlar (Carl Brosbøll), J. P. Jacobsen, Aage Madelung, Carl Møller, and Fanny Suenssen.

Our 28th anniversary

Project Runeberg was founded on December 13, 1992.

Looking forward to Public Domain Day

Copyright lasts for the author's lifetime and then for 70 full years (life+70), meaning that January 1st (Public Domain Day) is when it expires for those who died 70 whole years earlier. On January 1st, 2021, this happens to authors who died in 1950. So who are they? One easy way to find out is our list of Nordic Authors.

Among them are Nobel Prize winners George Bernard Shaw (1925) and Johannes V. Jensen (1944), but in the case of Shaw we also have to consider when the translators died. Other notable names are Harry Blomberg, Anna Branting, Edgar Rice Burroughs (Tarzan!), Ida Bäckman, Ewald Dahlskog, Ossian Elgström, Grenville Grove, Swedish king Gustav V, B. Rudolf Hall, Thorsten Jonsson, Martin Lamm, translator Wendela Leffler, Eva Neander, Ellen Nordenstreng, Oscar Olsson, George Orwell, Gösta Oswald, and Ester Ståhlberg.

Since we do digitize journals that are 70 years old, some of them already include articles by these authors, such as this 1946 article by Shaw about H.G. Wells in Bonniers litterära magasin, translated into Swedish by the journal's editor Georg Svensson (1904-1998).
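
The life+70 rule described above reduces to simple arithmetic. The following is only a minimal Python sketch of that arithmetic; the function names are our own illustration, and it ignores complications such as translators and co-authors, who have copyright terms of their own.

```python
def public_domain_year(death_year: int, term_years: int = 70) -> int:
    """Return the year on whose January 1st the copyright expires.

    Protection lasts for term_years full calendar years after the
    year of death, so the works become free on January 1st of the
    year after that.
    """
    return death_year + term_years + 1


def death_year_entering_public_domain(year: int, term_years: int = 70) -> int:
    """Return the death year of authors whose works become free on January 1st of `year`."""
    return year - term_years - 1


print(public_domain_year(1950))                 # 2021
print(death_year_entering_public_domain(2020))  # 1949
```

For a translated work, the same calculation would have to be repeated for the translator, and the later of the two years applies.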

Looking back at the 1949-ers

On January 1st, 2020, copyright expired for authors who died in 1949, including Nobel Prize winners Maurice Maeterlinck (1911) and Sigrid Undset (1928). By far, our greatest effort this year has gone into Sigrid Undset and Swedish writer Elin Wägner.

It is remarkable that we have no works by Vilhelm Ekelund, one of Sweden's most prominent writers. He is, however, well represented at Litteraturbanken. By Anna Lamberg Wåhlin we have no books, but many translations and articles in the journal Ord och Bild that her husband edited.

We now have several works by Joel Haugard, Aage Madelung, and Axel Munthe, some by Ernst Enochsson, A. Stefan Gustafsson, Akke Kumlien, Ernst Newman, Håkan Theodor Ohlsson, and Ernst Westerberg, but none yet by Johan Harald Kylin, Siffer Lemoine, Gustaf Reinius, Ansgar Roth, Storm P., Nils Evert Taube, Eva Wahlenberg, or Karolina Widerström.

It was fun while it lasted,
but there is a time for everything.

by Lars Aronsson, December 2020

I founded Project Runeberg in December 1992, 28 years ago, and have managed it almost single-handedly since then. It was an early prototype of what the Internet could be used for, what a website could look like, how a collaborative volunteer (crowdsourcing) project could be organized. It has inspired others to start their own websites, it has inspired literature scholars and librarians to digitize books, it has inspired some aspects of Wikipedia, the free encyclopedia.

Some people will remember how I also started a wiki website, "susning.nu", in October 2001, how it was closed to editing in April 2004, and how it vanished entirely some time later. Project Runeberg has now reached a similar point where it is closed to contributions. Perhaps it will reopen later, but not in the same shape. Luckily, the risk of it vanishing entirely is much smaller, but should not be neglected.

1. To every thing there is a season,
and a time to every purpose under the heaven:
2. A time to be born, and a time to die;
a time to plant, and a time to pluck up that which is planted;
3. A time to kill, and a time to heal;
a time to break down, and a time to build up;
4. A time to weep, and a time to laugh;
a time to mourn, and a time to dance;
Ecclesiastes 3:1–4 (KJV)

Project Runeberg has not been the same all of the time. It started out as a few text files on a Gopher and FTP server. The web (HTTP) server was added after about a year. When the project was six years old, I started to scan books as facsimile images of entire pages, instead of just presenting the resulting text. Some years later, just after the turn of the millennium, online proofreading through a wiki-like web form was added. Daily statistics of our growth date back to the fall of 2003. In 2005, Project Runeberg started to use UTF-8 characters for new books. The existing collection was converted to UTF-8 in 2012. Until circa 2010, the majority of books were scanned in black-and-white TIFF G4 format. Later, color JPEG has dominated.

From the beginning, Project Runeberg presented small poems and song texts. The first longer poems and complete novels came in the first years, as did the full text of the Bible in the Swedish translation of 1917. Among the first works in facsimile were the collected works in 14 volumes of Viktor Rydberg. The very first years of Wikipedia (and also susning.nu) coincide with the years (2001–2003) when I scanned the Swedish encyclopedia Nordisk familjebok (two editions, 20+38 volumes, 1876–1926), which was followed in 2004–2008 by the Danish encyclopedia Salmonsens konversationsleksikon (26 volumes, 1915–1930). Later, new genres such as complete years of journals and more than a hundred dictionaries have been added.

The first decade of the millennium was also the time when Google announced their intention to scan many millions of books in a decade (Wikipedia: Google Books), followed by similar declarations from national libraries in France and Norway. It was clear that book scanning was now a big thing, no longer an experiment. In Sweden, literature scholars started Litteraturbanken in 2004.

Wikipedia was growing more mature, and in 2007 I helped to organize the Swedish chapter of the Wikimedia Foundation, Wikimedia Sverige. I was a board member for the first five years (2007–2012). During this time I was also an active contributor to Wikipedia and some of its sister projects: Wikisource and Wiktionary. Wikisource is indeed a direct parallel to Project Runeberg, a book scanning and proofreading project. Maybe I could hope that Wikisource would replace Project Runeberg, just like Wikipedia had replaced susning.nu? I gave that thought serious consideration in 2010–2011, but found it far easier to add and proofread books in Project Runeberg than in Wikisource. Wikisource is one project in Swedish, one in Norwegian and another in Danish, each having very few active contributors in 2020. Only larger languages like English, French, Italian and German have succeeded in building active communities of contributors.

If Project Runeberg were to continue after 2011, it would need to reinvent itself. Some of the software needed to be redesigned and a reliable source of funding would be necessary. To find out what could and needed to be done, I applied for and received a grant from the Swedish Internet Foundation. During 2012 I attended the annual Wikimania conference in Washington DC and also took the time to visit the Internet Archive's scanning center at the University of Toronto and to meet in New York with Greg Newby, head of Project Gutenberg. To my disappointment, there was far less direction and coordination in book digitization than I had hoped to find. It seemed to me that every project was working on the funding they could hope to find to just randomly digitize whatever books they could find, in the vague hope that someone would find them interesting to read at some later date. There was no demonstrable use or benefit from book scanning that could directly motivate investments. This was a depressing insight.

I gave up hope of finding reliable funding for a "real" project and instead purchased a new scanner with support from Wikimedia Sverige. I had collected some books that I could scan and I did so, just adding volume to the existing Project Runeberg, without rewriting any software. Several volunteers teamed up to help in this effort, both scanning books, importing books scanned by others, and proofreading. But nobody volunteered to improve the software or repair the broken web forum or wiki. This was the end of Project Runeberg's slow growth of 50,000 pages per year (2006–2011) and the beginning of 250,000 pages per year (2012–2020). While these numbers seem like a great success, it was still a continued period of stagnation in technical development.

(Was there a conflict of interest when the board I had just left in 2012 gave me support to buy new equipment? I see it the other way around: I provided that organization with an opportunity to show how they supported successful scanning of books that serve the organization's purpose of improving Wikipedia, for a rather small amount of money. Both as a board member and when scanning books, I volunteered my time without salary or compensation. I'm more worried that nothing useful came from the grant I received from the Swedish Internet Foundation.)

In a few years at the beginning of the millennium, new technical functions were added to Project Runeberg at a fast pace. We could upload and proofread books, check the editing history, compare versions of a text page, and get daily statistics on growth. The codes or markup used when proofreading evolved into a language of its own, similar to HTML but not entirely, with its own syntax for table layout and poetry. There were also ways to index books, to edit the presentation of the books, to upload new versions of bad scans, to cut out illustrations from book pages and upload them separately. Added to this were our own wiki and a web forum. In all cases, these functions were implemented as rapid prototypes based on a quick idea, never to a written specification, and without unit tests or security considerations.

I developed some of these functions, but not all of them. And my helpers soon left the project without documenting their features or their limitations. Some volunteer proofreaders learned how to use them, but nobody knew how to repair them when something went wrong.

Some proofreaders wanted to do more than the markup language could offer, and invented ways to use table layout code for things like centered headings, hanging indent and side margin notes. Well, isn't that great, it looks nice and solves the problem, doesn't it? The problem is that we are supposed to proofread the text of the book, and if a reader spots an OCR error, he or she should be able to correct the error. When opening the proofreading form, there should be the text and not a rat's nest of markup code for table syntax. Markup always needs to be minimalistic.

Among the security considerations left out is the ability to monitor and revert abuse. Wikipedia has developed very advanced features for this, and as a side effect they are also available to Wikisource. Pages can be locked for certain categories of editors, editors can be blocked from editing, any edit will be listed in an edit history, and can easily be reverted by an administrator. Project Runeberg has no such roles of editors, no hierarchy of administrators. Essentially, all edits are anonymous. Edits to text pages during proofreading are properly logged in a history and can be reviewed, but there is no quick revert function. Edits to scanned images are not logged at all. Who did what? Nobody knows.
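
To make the missing piece concrete, here is a minimal Python sketch of an edit history that records who made each edit and offers a quick revert. It is not how Project Runeberg or Wikipedia is implemented; the class and method names (`Revision`, `PageHistory`, `revert`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Revision:
    """One saved version of a page, with the editor who saved it."""
    editor: str
    text: str
    timestamp: datetime


@dataclass
class PageHistory:
    """Append-only list of revisions for a single text page."""
    revisions: list[Revision] = field(default_factory=list)

    def save(self, editor: str, text: str) -> None:
        # Every edit is logged; nothing is ever overwritten.
        self.revisions.append(Revision(editor, text, datetime.now(timezone.utc)))

    def current_text(self) -> str:
        return self.revisions[-1].text if self.revisions else ""

    def revert(self, admin: str) -> None:
        """Restore the previous text by saving it again as a new revision."""
        if len(self.revisions) < 2:
            raise ValueError("nothing to revert to")
        self.save(admin, self.revisions[-2].text)


page = PageHistory()
page.save("proofreader", "A time to weep, and a time to laugh;")
page.save("anonymous", "rat's nest of table markup")
page.revert("admin")
print(page.current_text())  # A time to weep, and a time to laugh;
print(len(page.revisions))  # 3
```

Because the revert is itself stored as a revision, the abusive edit stays visible in the history, so the question of who did what can still be answered afterwards.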

For a project like Project Runeberg to continue as it did in the 2010s, all volunteers must be careful and use the existing functions only in moderation. The project is very vulnerable to attacks. Intentional abuse can quickly get out of hand. This is what happened to susning.nu in 2003 and 2004, and led to the site being closed to editing. Fortunately, Project Runeberg has had no cases of intentional abuse, which is amazing. But in a handful of cases, there have been overly enthusiastic proofreading volunteers who can't accept that some undocumented functions (separate uploading of illustrations) have stopped working or that they aren't allowed to use table markup code to make prettier text pages, or who repeatedly make clumsy mistakes that can only be corrected by the only inside administrator (which is me). In most cases, a simple explanation has sufficed to correct them, but a few have been very stubborn. And for me, trying to lead and develop the project, supporting such users has taken an increasing fraction of my own volunteer time, which is why I decided on December 18 to close the project to editing.

We all have plenty of time to think of where this should go next. Readers can continue to read existing texts on Project Runeberg as long as the website stays open, hopefully for ever. Volunteer proofreaders will have to find a new hobby or move to some other project, perhaps Wikisource. Perhaps I will reopen some of the functions, such as simple proofreading, after first making sure that they can be monitored and reverted. But I am very reluctant to spend time on implementing a full security system with administration roles, locking and blocking.

But perhaps we need to take one step further back, and ask again: why are we really scanning and proofreading books? Is there any real use or benefit to it? How can that benefit be measured, and is there a way to use it as a source of funding?

Some of our digitized works are used a lot, such as the encyclopedias I mentioned and some dictionaries. But does it matter that we also have indexed and proofread most of the text? I never hear any praise for this great effort, or lamentation that some other books are not yet proofread. The national library of Norway has during the 2010s digitized all Norwegian books, with good page images and rather good OCR text, but proofread none of it. And Norway is not in a deep crisis because of this. Maybe it's fine to just skip proofreading? The Swedish national library has digitized far fewer of its books, and Sweden is not in a deep crisis because of this. Maybe it's fine not to scan books? These are the key questions that have not yet been answered.

Project Runeberg has now completed three annual fundraisers, each time raising 25,000 SEK (US$ 2900), which is enough for buying new equipment and covering expenses, but insufficient for salary to administrators and software developers. Even if some software developers would volunteer their time and skill, there needs to be a coordinator that stays on the project and doesn't just disappear. I estimated in 2012 that a reasonable project with staff and equipment could be operated on an annual budget of 1–2 MSEK (US$ 120–240 thousand). From where could we get that kind of funding, and how would we write the motivation for it?


Project Runeberg, 2023-07-01 00:13 (runeberg)
http://runeberg.org/
