Using a Flash-based document viewer to display PDF content in MediaWiki

Working with a PDF sometimes introduces hurdles, even more so when people don’t have a PDF reader at hand. We wanted a solution that is independent of the PDF format, widely used, and still workable for people using MediaWiki.

We initiated the development of a small MediaWiki extension that would allow us to display the content of uploaded PDFs in a Flash-based document viewer.

The basics

The extension makes use of existing open source software: SWFTools (which converts PDFs into the SWF format on the fly) and FlexPaper, a Flash-based document viewer (released under the GPL). Some technical information about the project can be found on the GitHub project page.

The technical inside

Proof of concept

When a PDF file is viewed in MediaWiki, the extension checks whether an SWF version of the PDF is available and, if not, initiates the rendering process. If an SWF file is found, the extension uses the FlexPaper viewer to display its content within the same namespace as the PDF.

Does it work?

The extension is in an early alpha stage; nevertheless, the SWF rendering integration and the document viewer functionality work and can be seen in the video below.


Advanced search functionality in MediaWiki

For as long as we have known it, MediaWiki’s search interface has been as rudimentary as it gets and fulfils only basic needs. We acknowledged this a long time ago, but we hope that with a renewed focus (based on Wikipedia research) things can get better and we will see real improvements that allow combining search patterns, combining external searches (with other web services), or even the emergence of faceted search.

We always wondered why websites like EBSCO or ProQuest, which handle high volumes of data, offer advanced search features while none of those has reached our beloved Special:Search page. The most advanced feature one can work with is to exclude namespaces from a search and, if one is aware of “intitle:”, “incategory:”, and “prefix:”, to limit the search results.

We would not ask for much, and definitely not for a holy grail, but things like combining several search terms with different options would be a start and should be standard within a website like MediaWiki.

Instead of being bound to a basic search input field, a bit more selectivity would be nice for the advanced searcher and researcher who uses such a database on a regular basis.

External Search Engines

MediaWiki’s internal search is certainly a no-go solution, and therefore another area for improvement is how external search engines can interact with MediaWiki. Since version 0.8, the SphinxSearch extension is compatible with MW’s Special:Search page and is a more comfortable solution than Lucene, where it seems one has to pass a technical expertise course to get it up and running.

Helping extensions to access more meaningful search input and to present search results is, needless to say, a pressing matter.


See also

The only tweak we have found so far that makes search life in MediaWiki more satisfying is that MW’s internal pages MediaWiki:Searchmenu-new and MediaWiki:Searchmenu-exists allow the display of additional guidance. Enhancing those with some templates allows displaying Semantic MediaWiki information without any core development; see also Combining full-text search and semantic search in a one-step process.

An open Bugzilla report on the matter of advanced search functionality.

Notes

We are neither experts nor developers and merely state our experience; therefore we can’t give any advice beyond the description above, and in future versions this information might be obsolete. We see the information supplied here only as a way to share experience, which can vary with different setups; people are encouraged to make improvements beyond what has been described here. Screen dumps are for illustration and educational purposes only and do not imply any copyright infringement.


Using Semantic Internal Objects to build a dynamic book gallery

Browsing books by title and author is a standard feature of any library catalogue, but sometimes one is tempted to browse books by their covers (a visual rather than textual search). Semantic MediaWiki can help with this task when data sets about books are stored in a MediaWiki system.

To achieve this goal of a browsable book gallery, we use Semantic Internal Objects (SIO) to store the data about the objects that should be browsable. The selection process is supported by a Semantic Forms (SF) RunQuery form that defines the selection criteria (subject or book type); with the help of Semantic Result Formats (SRF), a gallery (standard MediaWiki) is then displayed containing only the selected objects (book covers).

By solely combining those components, a dynamic gallery (as seen below) can be generated without any additional programming effort.

Video

How to …

For specific help on how to employ SIO[0], SF[1,2], and SRF, please see the links at the end. In the book gallery, cover images are stored in the File: namespace[3], together with a template that includes the settings for Semantic Internal Objects; the system stores the required data every time the object (File page) is changed.

Solution

Most of the work is done by the embedded SIO call in the File: namespace template, combined with a form that ensures certain data structures and policies are followed, to minimize the impact of storing unrelated content.

Storage (File namespace template)
{{#set_internal:Is cover image for
 | Has image={{FULLPAGENAME}}
 | Is gallery member of=Book
 | Has caption=[[{{{1}}}|{{#show:{{{1}}}|?Title|link=none}}]] ({{#show:{{{1}}}|?Author|link=none}})
 | Has gallery category#list={{#show:{{{1}}}|?Keyword|link=none}}
 | Has object type#list={{#show:{{{1}}}|?Type|link=none}}
}}
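
As a usage sketch, assume the storage call above is saved as a template named Template:Book cover and that the book data lives on a page called The Art of Example. Both names are hypothetical; the post does not name them. The File: page of a cover image would then contain:

{{Book cover|The Art of Example}}

The first parameter, {{{1}}}, then resolves to the book page, so the #show calls can read its Title, Author, Keyword, and Type properties and store them with the internal object.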

Selection and display (RunQuery template)

The selection query (seen below) is used in a RunQuery template to generate the gallery output.

{{#ask: {{#if:{{{type|}}}|[[Has object type::{{{type|}}}]]}}{{#if:{{{keyword|}}}|[[Has gallery category::{{{keyword|}}}]]}}
| ?Has image
| ?Has caption
| format=gallery
| imageproperty=Has image
| captionproperty=Has caption
| link=none
| perrow=6
| offset={{{offset|0}}}
| limit={{{steps|5}}}
}}
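
The form that feeds this query is not shown in the post. A minimal sketch of a Semantic Forms definition could look like the following, assuming the query above is saved as Template:BookGalleryQuery and the form as Form:BookGallery (both names are hypothetical). The field names type and keyword match the template parameters used in the query:

{{{for template|BookGalleryQuery}}}
Type: {{{field|type}}}
Keyword: {{{field|keyword}}}
{{{end template}}}
{{{standard input|run query}}}

With such a form in place, the gallery would be reachable through Special:RunQuery/BookGallery. The offset and steps parameters are left at their defaults in this sketch.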

The combination of the above modules, with some div and CSS tweaks, allows the display of a book gallery as seen in the first screen dump.

Help

[0] Storing Semantic Internal Objects

[1] How to use templates and forms within a namespace

[2] Using Semantic Forms to query data

[3] http://www.mediawiki.org/wiki/Manual:Namespace

Note

We deliberately chose not to query external data (such as covers) in order to save bandwidth and to structure data according to our internal policies.

We are neither experts nor developers and merely state our experience; therefore we can’t give any advice beyond the description given here, and in future versions the approach might no longer work. We see the information supplied here only as a way to share experience, which can vary with different setups; people are encouraged to make improvements beyond what has been described here.


Using jQuery to change the font size in a MediaWiki article

We were looking for usability improvements and found that it can sometimes be an advantage to change the font size of an article while reading, without having to edit the article itself. The new MediaWiki 1.17 allows us to easily insert some jQuery JavaScript code that handles the change of font size within seconds.


Combining full-text search and semantic search in a one-step process

Sometimes you don’t really know what you’re looking for, and some help from the properties/values stored in Semantic MediaWiki might give an indication of where and what to look for. Normally you don’t have the two searches in one place, which decreases efficiency and the willingness to open another window to start a semantic search.

MW’s internal pages MediaWiki:Searchmenu-new and MediaWiki:Searchmenu-exists allow the display of additional information (running some simple #ask queries without having to define an on-the-fly query itself) while running a standard full-text search.

MW Standard Search, SphinxSearch, and SMW tagcloud

As the above example shows, the standard search results are displayed on the left side, while the right side shows some additional information such as a tag cloud derived from Semantic MediaWiki.

How is it done?

We expect most searches to be some kind of keyword-related search; therefore we define the #ask query with that in mind and prepare a template, SMWKeywordSearch, that includes the following statement:

{{#ask: [[Keyword::{{{1|}}}]]
|format=tagcloud
|limit=5
}}

The only thing left is to call this template, SMWKeywordSearch, within the page MediaWiki:Searchmenu-exists. Every time a full-text search runs through Searchmenu-exists, it includes the #ask query, and if it finds a match for Keyword::{search term} it lists those results automatically.

The #ask example is very straightforward, but it demonstrates that one can combine full-text and semantic search in a one-step process. One remaining remark: carefully draft only the queries that are needed by the audience, limit the number of results displayed (in our case limit=5; when there are more results available, one has the possibility to list them on a different screen, which is standard Semantic MediaWiki behaviour), and do not make too many fancy enhancements that degrade display performance.

An enhanced display as above is the result of allowing the #ask results to be encapsulated in a <div>.
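
As a rough sketch of that last step, the content added to MediaWiki:Searchmenu-exists could look like the lines below. The inline style values are assumptions chosen for illustration, and $1 is taken to be the placeholder that MediaWiki replaces with the search term in this message:

<div style="float: right; width: 25%;">
{{SMWKeywordSearch|$1}}
</div>

The search term is passed as the first template parameter, which the #ask query reads as {{{1|}}}. The same call can be placed in MediaWiki:Searchmenu-new so that the tag cloud also appears when no page with the searched title exists.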



Using Semantic Forms RunQuery to switch RSS Feeds in seconds

Keeping up with news from around the world is an important task for any research department. Managing various news and information channels can become difficult, but with a bit of help from MediaWiki, information filtering can be improved.