Show HN: Sioyek – PDF viewer for reading research papers and textbooks (github.com/ahrm)
238 points by hexomancer on Feb 19, 2022 | 81 comments
Some of the features:

* Quickly preview or jump to figures/references/equations/etc. (even if the PDF doesn't have links)

* Search paper names in Google Scholar by middle clicking on their name

* Searchable table of contents

* Searchable highlights/bookmarks

* Browser-like history navigation

* Mark locations for quick navigation (Vim style)

* SyncTeX support

Video demo of some features: https://www.youtube.com/watch?v=yTmCI0Xp5vI




Previous discussion from 7 months ago: https://news.ycombinator.com/item?id=27893303


For research work I prefer LiquidText on a 12.9” iPad with an Apple pencil. It’s critical for me to make notes and annotations.

It’s crazy how far pen computing has come. I was an early adopter of this as a college student back in the 2000’s. I had a Toshiba laptop which had a screen that would rotate around and fold on itself to become a tablet. It had a pen for writing and ran a special version of Windows designed for tablets. (https://the-gadgeteer.com/2006/04/27/toshiba_portege_m200/)

There weren’t many good sources of PDFs so I made my own. I would take my textbooks to Kinkos where they had an industrial paper cutter that was able to slice off the binding, leaving me with a bunch of loose pages. I would use a double sided auto-feed scanner to scan all the pages into searchable PDFs.

Sadly the technology wasn’t ready for prime time. The software wasn’t good enough, drawing was limited and not many apps took advantage of the pen. The hardware was heavy, bulky, and I always had to be around an outlet because the battery life was abysmal.

An iPad + pencil is a truly remarkable experience compared to that previous setup.


> It’s critical for me to make notes and annotations.

What notes and annotations do you usually make? I can understand notes when you're at a lecture discussing a paper, but my experience is that the lecture rarely pertains to one specific paper.

I don't really take a lot of notes, so I wonder if I'm doing something wrong.


It depends what you are reading and why. Check out “How to Read A Book” by Mortimer Adler. The book is quite old but still extremely relevant. (He’s one of the best philosophers/thinkers I’ve ever read BTW. I suggest “How To Think About The Great Ideas”, but that’s for another topic.)

On the subject of note taking a professor once told me “don’t take notes, make notes”. Note taking is basically writing down what the author (lecturer) says. In college so many people would just blindly copy what the professor wrote on the board. That is note “taking”. Note “making” is adding your thoughts: what is the author saying (in my words) and do I agree with the ideas? What would other authors think, are there counter points, supporting examples, etc. Is this point elaborated elsewhere in the book? What terms, words, or concepts need more explanation?

I found this blog (https://fs.blog/how-to-read-a-book/); it has a good summary of Adler’s ideas. For example, on using the margins in a book:

    When you buy a book, you establish a property right in it, just as you do in clothes or furniture when you buy and pay for them. But the act of purchase is actually only the prelude to possession in the case of a book. Full ownership of a book only comes when you have made it a part of yourself, and the best way to make yourself a part of it— which comes to the same thing— is by writing in it.

    Why is marking a book indispensable to reading it? First, it keeps you awake— not merely conscious, but wide awake. Second, reading, if it is active, is thinking, and thinking tends to express itself in words, spoken or written. The person who says he knows what he thinks but cannot express it usually does not know what he thinks. Third, writing your reactions down helps you to remember the thoughts of the author.

    Reading a book should be a conversation between you and the author. Presumably, he knows more about the subject than you do; if not, you probably should not be bothering with his book. But understanding is a two-way operation; the learner has to question himself and question the teacher. He even has to be willing to argue with the teacher, once he understands what the teacher is saying. Marking a book is literally an expression of your differences or your agreements with the author. It is the highest respect you can pay him.


Thanks for the thoughtful reply.

That blog does indeed have good inspectional notes about Adler's book :-)


I just write my opinions on the margin, or general thoughts, especially something new that can stem from there?

I mostly write connections that I make while reading a paper.


I don't take notes, but use a pen and paper as a scratch pad. If the paper has some math, I can sketch it out and try to break it down, allowing me to understand it better. Or if there's a weird concept, I can draw it out and visualize the issue.

It's also useful when the information is really dense and there are certain things I need to go back and recall but don't want to search my short term memory for.


Could you please tell me what generation is your 12.9” iPad? I am thinking about getting an older model for annotating papers, taking notes, and to use as a whiteboard during Zoom calls. I am trying to figure out what is the oldest model that would be appropriate for these tasks. Would the first gen be performant enough?


I have a 2nd generation, which has the home button. (I got my wife the 1st generation when it was first introduced and it’s still functioning. The battery doesn’t charge well but it’s still good for surfing the web and playing games.)

Apple changed the pencil. I wouldn’t get any iPad that wouldn’t support the new pencil, which I think is from the 3rd generation when they removed the home button and introduced rounded corners.

The old pencil has a lightning connector and plugs into the iPad to charge. The new pencil magnetically attaches to the iPad and charges.

Apple should release the next generation soon. (Maybe in a couple weeks.) Hopefully you can get a used one at a good rate when that happens.


The only thing missing is full-fledged Zotero, or something as good!


It would be great to reflow the document in single column mode, not to scroll back from the bottom of the left one to the top of the right one.

Actually, if these documents are mostly consumed on screen, just write them single column, half of an A4 or Letter, or plain HTML. Do they keep creating PDFs because publishers sell paper journals?


I prefer PDFs a hundred times over plain HTML, both for research papers and books. HTML can only reproduce proper structuring with great pain and it is essential for most scientific content. I'd rather scroll sideways than not have the essential equation displayed correctly.


PDFs are much better to distribute. You get a consistent rendering on any device, which is particularly important for mathematical expressions, tables, and figures. I don't want to navigate the zillion libraries I can use to do this in HTML, which will break anyway because the viewer won't have the right version. PDF works just fine for scientific papers.


In my field, almost no one really buys paper journals. I think it's mostly a matter of tradition at this point. However, PDF does also mean the author gets control over what the document looks like. Of course it's possible to produce similar quality work using HTML, but many authors and publishers are just so accustomed to the existing stack that a lot of it is tradition. Fortunately, it's becoming fairly common for IEEE to provide HTML versions of papers generated from the LaTeX source that are single column. They're not perfect, but they can be a nicer reading experience in some cases.

Personally, if I'm doing any serious paper reading, I'm often doing it on my reMarkable so I like a nicely formatted PDF.


I haven't tried this yet, but the idea that I can click on a reference and have a window pop up to show me the equation/figure/table without actually taking me there is awesome. I've been trying to figure out how to do this pop-up thing in latex so it would do this in regular pdf viewers but the current methods I've seen are all too clunky. To be able to do this purely in the viewer is fantastic.


I usually read PDFs in Firefox, with two tabs open for the same document. I use the second tab to scroll around without losing context.

This is also possible to achieve in Emacs with pdf-tools, and with just one instance of the document, by setting marks and following hyperlinks.


evince does that when you hover over links.


Which, if anyone is not aware, is the default gnome pdf viewer. Evince is pretty good software -- simple and does the job well.

However, evince only does it if the references are actual (hyperref) links.


The preview feature, in particular, is exceptionally important. I cannot overstate what a difference it makes when reading deeply cross-referenced material.


This looks great. It was obviously created by someone who reads technical papers, and figured out how to make the process more convenient. Going to install as soon as I finish enjoying the videos—which are also well done.


A nice feature would be the ability to specify the page from which page numbering starts.

Many books label the first page of chapter 1 as page 1, and the preceding pages I… XI …

My solution is to split the PDF into 3 parts:

- before page 1
- main content
- index (everything after main content)

This makes it possible to type the page number and go to that page.
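For anyone who would rather not split the file, the mapping from a printed page label to a physical page can be sketched in a few lines. This is only an illustration under stated assumptions: the roman-numbered front matter starts at physical page 1, printed page 1 follows it immediately, and the names `roman_to_int`, `physical_page`, and `front_matter_pages` are hypothetical and not taken from any viewer.

```python
def roman_to_int(label: str) -> int:
    """Convert a roman numeral such as 'xi' or 'IV' to an integer."""
    values = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}
    total = 0
    prev = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for ch in reversed(label.lower()):
        value = values[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def physical_page(label: str, front_matter_pages: int) -> int:
    """Return the 1-based physical page for a printed page label.

    Assumes the front matter is numbered i, ii, iii, ... from physical
    page 1, and that printed page 1 comes right after it.
    """
    if label.isdigit():
        return front_matter_pages + int(label)
    return roman_to_int(label)
```

With 11 front-matter pages, `physical_page("1", 11)` gives 12 and `physical_page("XI", 11)` gives 11, so typing a printed page number lands on the right page of the unsplit file.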


We have this feature! See https://github.com/ahrm/sioyek/issues/86.


Since this topic is likely to garner people who do or use research I hope to find an answer to the burning question of why on Earth don't researchers put a freaking publication year in their paper. Sometimes it's virtually impossible to learn if the paper is 20 or 2 years old. For something that matters quite a lot I always find this "tradition" a bit nonsensical.


I don't know, but usually searching for the paper title online, e.g. in google scholar or (for CS papers) on dblp, will turn up the venue where it's published, including publication date. In fact, if it's in dblp, it will show the year right on the results page.


Thank you for replying but it still doesn't explain this bizarre practice.


One reason is the lag between a paper being "online first" vs. being included in an issue. I've seen as much as three years (paper published in its accepted form in 2017 but not included in an issue till 2020).


How can the researchers be 100% sure in what year the paper will get published? I imagine that at best they could put the year of submission into the paper, but that's about it.


You mean in a preprint or something? Because in any published version, of course, the year is right there.


I've never seen a date on a single academic paper available online (i.e., from arXiv)


I have, quite often. But, for example on arxiv, the date is on the page that lists papers and that has the links to the various versions of the paper. And, in case the author has neglected to include the date in the PDF, arxiv adds the date, embedded in the PDF, vertically at the left of the first page.

Added: But you’re right that authors often neglect to include the date on the manuscript or preprint versions, and this is a problem. I guess because journals often don’t want this, as they will add the submission and publication dates.
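If you already have the first-page text of an arXiv PDF, the date in that left-margin stamp can be pulled out with a regular expression. A rough sketch, assuming the stamp looks like `arXiv:1706.03762v5 [cs.CL] 6 Dec 2017` (identifier, optional version, category in brackets, then day, month abbreviation, year); the pattern and the function name `parse_arxiv_stamp` are my own and not part of any arXiv tooling.

```python
import re
from datetime import date

# Assumed stamp layout: "arXiv:<id>[v<N>] [<category>] <day> <Mon> <year>"
STAMP_RE = re.compile(
    r"arXiv:(?P<id>\S+?)(?P<version>v\d+)?\s+"
    r"\[(?P<category>[^\]]+)\]\s+"
    r"(?P<date>\d{1,2} [A-Z][a-z]{2} \d{4})"
)

MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_arxiv_stamp(text: str):
    """Return (identifier, date) from the stamp, or None if no stamp is found."""
    match = STAMP_RE.search(text)
    if match is None:
        return None
    day, month, year = match.group("date").split()
    if month not in MONTHS:
        return None
    return match.group("id"), date(int(year), MONTHS[month], int(day))
```

Note that this date is when that version was posted to arXiv, which is not necessarily the publication year of the journal version.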


Are the notes and highlights stored in the PDF? Asking because I use Zotero on several machines and I need annotations and highlights to be synchronized without much hassle.

Zotero is quite close to providing a PDF reader itself (available in the Beta release, AFAIK), but nice to see alternatives with academic documents in mind.


No. However, we have a command which exports a version of the PDF file with embedded notes and highlights.


Thanks a lot for explaining this. Having such a feature is definitely useful.


This looks really neat. I hope to make use of it.

I noticed a minor error in the tutorial under Basics: "scrolling down half of screen width" should be something like "scrolling down half the window height". Animating the scrolling would make it much less disorienting.

Is there a way to see a list of marks? Slices of the screen like the reference preview would be great, but just a list of defined mark names and page numbers would be useful.

I'd love proper touch support. It's so much faster for zooming in on figures or scrolling through pages quickly.

I often view PDFs with two pages side by side in Adobe Reader or Sumatra, and that's great for looking for things (move through documents with left and right arrows). I'd miss that.


Currently there is no way to see a list of marks, but you can see the list of bookmarks.


Installed using the zip file, and it works great. Fantastic job!

But I thought you’d like to know that I couldn’t get it to compile on a Debian system. I followed your instructions, but found I also had to install

    libglu1-mesa-dev
    libxi-dev
    libxrandr-dev
and then the build script failed with

    Project ERROR: Unknown module(s) in QT: core gui sql opengl widgets quickwidgets 3dcore 3danimation 3dextras 3dinput 3dlogic 3drender openglextensions
I have QT installed, but I couldn’t figure out which packages are missing.


You may need to install q3d-dev and qtdeclarative5 (the exact package names might be different for your distribution). Also see this: https://github.com/ahrm/sioyek/issues/97#issuecomment-962556...


Thanks! And thanks for making and sharing sioyek, and all your responses here.


I have been using this as the PDF viewer on Windows for a few months now, and like it. Thanks for creating this. I like the fast bookmark lookup (t+start typing) and the Vim-like bindings. Dark mode also works well. Real estate is maximally used for the content.

There were a couple of features that I missed, and for which I needed to fire up a different viewer: 1. View PDF properties 2. Enter a slide-show mode, where a full page is shown and arrow keys advance (not scroll) the page.


We added slide-show mode a while back: https://github.com/ahrm/sioyek/issues/52


What about notes? Is there a way to add notes to highlights? It would be nice if they are also displayed on the side.


The bookmark feature is basically notes, though they are not currently displayed on the side.


This looks great. Now I need it in emacs. pdf-tools could probably emulate some of these features with enough work…


What I like most about this is the idea of clicking a reference to display a floating preview of the concerning paper information, figure or table. That seems like a really nice solution!

Any idea which other PDF viewers implement this behavior?


If you're on Linux, GNOME Document viewer (aka evince) does this when you hover over a link. It doesn't work for non-linked references afaik though.


Skim.app also does this


Skim is awesome. A great, unassuming little app that keeps doing its job perfectly 10 years later.


Thanks! Running `brew install --cask skim --no-quarantine` right now!


Pdfs often have huge margins, so a feature I'd like to have is soft crop - just for viewing, without modifying the file, like goodreader has. Although it's not as important on desktop as it is on a tablet.
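To make the idea concrete, a view-only crop can be computed from the bounding boxes of whatever is on the page, leaving the file untouched. A minimal sketch, assuming the viewer can already report content boxes in page coordinates; `Rect`, `soft_crop`, and the `padding` default are hypothetical names and values for illustration, not GoodReader's or sioyek's code.

```python
from typing import Iterable, NamedTuple


class Rect(NamedTuple):
    x0: float
    y0: float
    x1: float
    y1: float


def soft_crop(page: Rect, content: Iterable[Rect], padding: float = 5.0) -> Rect:
    """Return the region of the page to display, without modifying the file.

    The region is the union of all content boxes, grown by `padding`
    and clamped to the page. A page with no content is shown in full.
    """
    boxes = list(content)
    if not boxes:
        return page
    return Rect(
        max(page.x0, min(b.x0 for b in boxes) - padding),
        max(page.y0, min(b.y0 for b in boxes) - padding),
        min(page.x1, max(b.x1 for b in boxes) + padding),
        min(page.y1, max(b.y1 for b in boxes) + padding),
    )
```

The viewer would then zoom so that the returned rectangle, rather than the full page, fills the window.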


So I use Moonreader on Android and it preserves the zoom level as you scroll/flip through pages so it kind of does this. Not sure if that's relevant for you.


What would be the benefit of using this compared to something like Sumatra?


I used to use sumatra myself, in fact before I started developing sioyek, I tried to add some of the features to sumatra here: https://github.com/sumatrapdfreader/sumatrapdf/pull/869 .

Anyway, I don't think sumatra has these features:

* Marks

* Preview links

* Jump to figures if the PDF doesn't have links

* Portals

* Searchable highlights

And some other features which are shown in the video and github page. Also sioyek is available on linux and macos.


Does it fit page correctly according to each page size, rather than the max page size? This is one bug in zathura that bothers me.


Yes, there is a "smart fit page" shortcut which automatically fits to screen width as you scroll.
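The difference between the two behaviours in the question comes down to which width the zoom is computed from. A small sketch of the arithmetic only, with hypothetical function names; it is not the actual sioyek or zathura code.

```python
from typing import List, Sequence


def fit_width_zooms(page_widths: Sequence[float], window_width: float) -> List[float]:
    """Per-page zoom: every page is scaled to exactly fill the window width."""
    return [window_width / width for width in page_widths]


def max_width_zoom(page_widths: Sequence[float], window_width: float) -> float:
    """Single zoom derived from the widest page, applied to all pages.

    Narrower pages end up smaller than the window, which is the
    behaviour complained about in the question.
    """
    return window_width / max(page_widths)
```

For pages 600 and 1200 units wide in a 1200-unit window, the per-page zooms are 2.0 and 1.0, while the max-page approach uses 1.0 for both and leaves the narrow page at half the window width.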


It'd be nice if it would automagically find the PDF's BibTex corresponding entry...


I wonder how far from Linux-equivalent of Marginotes we are. Seems like a lot


Does this work on Linux?


It does




From a world-class research university, I hold a Ph.D. in math and am pretty good with D. Knuth's math word processing software TeX.

I don't like trying to read PDF files of math on a computer screen: Typically the fonts are way, Way, WAY too small unless I magnify the display of the file a LOT, but then the lines are WAY too long to fit on the screen forcing me to use the horizontal scroll bar as the main effort in reading the math. Or put another way, in PDF files of math documents, nearly always there are WAY too many characters per line. The situation seems to be that the journal wanted to save on paper and ink!!!! When I develop a document with TeX, I use the TeX commands to magnify the fonts by a LOT. Bluntly, without a screen at least four feet wide with maybe 16,000 pixels per line, reading the usual PDF file of math is a PAIN. So, sure, I'm eager for better ways to read PDF files of math.

My reactions to this OP (original post):

(1) I have no idea what is meant by a "middle click", nor would I have any idea where to look up the meaning. There might be a rule in technical writing -- never, ever, but NEVER, not even once in a whole career, on risk of horrible pain, e.g., a barbed wire enema, use terminology that is not very, VERY, essentially universally, well understood without explanation or at least a reference. This rule would also apply to acronyms.

(2) For searching a PDF file of math, I have no idea how I would type in the math expression to be searched for. Maybe the software is accepting TeX syntax -- I can guess that even that approach would have problems.

(3) For the video, the text is far, far too small to read and goes by far too fast to get any information at all.

Broadly I can't make any useful sense out of the OP at all.

I'm eager for better ways to read PDF files of math papers, and maybe there is some good work and good utility here, but I have zip, zilch, zero understanding of what is being attempted or how it would work -- nichts, nil, nada, none.

Uh, this is not nearly the only place where some technical material could use better technical writing.


> nor would I have any idea where to look up the meaning

https://www.google.com/search?q=middle+click

There. Does your "world-class research university" block the use of search engines?


Goes to show you someone can get a PhD in math, do research at a "world class institution", not know what a middle click is, and still be a dick about your technical writing. Does wonders for y'alls imposter syndrome eh?


Technically yes, but the vast majority of us have met him before. He, or someone like him, teaches Cal I to first-years (because he's gotta teach something beyond his fourth-year five-student moon-man seminar on his inscrutable research focus). Forgive him; he learned TeX before mouses had a scroll-wheel. I had to work with this prototypical fellow once, when I was already out of university; his abilities -- and inabilities -- were stupefying, in both senses of the word.


Just think, in 20 years, we'll be bemoaning the neural interface being unusable because it doesn't work just like computers did in 2005-2015ish and we don't get it and kids these days with their jargon like "middle think" being confusing.

Hell, it's already started with "the kids" being all-in on bizarre shit like using Discord for everything, and, well, anything about crypto, NFTs, unironically having Internet-capable dishwashers or, indeed, almost anything to do with mobile apps.


> I have no idea what is meant by a "middle click", nor would I have any idea where to look up the meaning.

Clicking the middle button on a mouse. Searching for [middle click] on Duck Duck Go or Google would tell you what it is.


So, you did a search at some of the search engines and found some explanations for the computer jargon "middle click".

Here is a point: Before doing such search, can't be sure will find a good answer. That was my point: There was no way for me to know before doing such a search that such a search would be successful. That is, the search engines are not guaranteed glossaries for all technical jargon in all technical fields.

So, my explanation that I didn't know where to find a description of "middle click" was correct -- before trying a search at some search engines quite literally we did not "know" where we would find an explanation. Now that such a search has been successful, sure, we do know, but that fact is a bit weak as justification for using undefined jargon.

So, since the search engines are not comprehensive glossaries for all technical jargon in all technical fields, given some technical jargon, a user has to make in effect shots in the dark, has to make these shots in the dark just to read a description of some product.

I suggest that product developers should avoid asking their audience to make such shots in the dark just to read their product descriptions.


"Middle click" is primary school level computer knowledge, not obscure technical jargon. I didn't need to do the searches to know they would work, and people writing technical software don't need to avoid such basic terms.


> "Middle click" is primary school level computer knowledge,

While I spend well over 40 hours a week at a keyboard looking at a screen driven by Windows with data mostly from the Internet, I have not heard of or used the mouse middle click in years.

The terminology "middle click" just isn't very useful and hasn't been very common for maybe 20 years.


Thanks for your feedback.

1. I meant mouse middle click, I think this is pretty standard terminology.

3. You are right, the video could be a lot better. You are not really supposed to read the texts though, the main purpose is to showcase the features.


> 1. I meant mouse middle click, I think this is pretty standard terminology.

Well, for a lot of people, e.g., a big fraction of the HN audience, sure, okay.

But: Yup, way back when personal computers were starting to use a mouse, there was no "middle click". Then there was, as a step forward -- I remember that. Now here is a surprise: While I spend well over 40 hours a week at a keyboard looking at a screen driven by Windows with data mostly from the Internet, I have not heard of or used the mouse middle click in years. The main reason is that on Windows, at least with the options I have, all the middle click is good for is a fast version of vertical scrolling on the contents of a window on the screen, and I don't find that fast scrolling to be useful. So, I have just forgotten about middle clicking. I'm up on a LOT of old stuff in computing, a lot of it from way back before middle clicking, but not middle clicking -- I just didn't find it useful. Here I'm using my experience as evidence for a

Lesson: In making assumptions about what is "pretty standard" for your users, it is tough to be accurate and, for a solution, work to err on the side of assuming less, a lot less.

There is one more point: It is totally obscure to me how fast scrolling could play much role in tracing references in PDF files of math. Maybe the Windows Win32 API permits programs to treat the middle click and scroll wheel any way they want. If so, then that is something else tough to assume users know and be accurate. E.g., from my experience, I've written a lot of software and a lot for Windows .NET but so far I've never written any code that directly called the Win32 API. So, I have no good knowledge on what might be in Win32 for middle clicking.

Lesson: "pretty standard terminology", I wouldn't assume that.

More generally I claim that computing is being badly hurt by way too little in definitions for way too much technical jargon.

There is a simple solution: When in doubt, and even if not, work to define or at least give a reference for technical jargon. E.g., here at HN (Hacker News) apparently (although I've seen no statement) OP abbreviates "original post".


I first used a computer mouse as an adult over 30 years ago and have been 'middle-clicking' for about 28 of them.


This PDF reader looks beautifully done. It will undoubtedly make reading papers more convenient. I have a trackpad with two buttons. How do I make a middle click? I’m sure the answer is in the documentation, but I’m being lazy, since you’re here.


It depends on your laptop. For my laptop it's both buttons at the same time.


Thanks. I’ll try that after I install it, which will be after I finish watching your very nice videos.


On a Mac laptop (or trackpad), it’s clicking with two fingers.


This is incredibly valuable feedback directly from the intended audience, provided for free. Awesome!

A middle click is the middle mouse button, typically clicking with the scroll wheel these days. I don't use a Mac touchpad/mouse but Google says it can be done with a triple finger tap, possibly requiring a setting to be enabled under Accessibility.

I'm glad tools exist to find more details about terminology that I don't already know. It's always a balance to decide what needs explanation and what a reader probably already knows.


My guess for the meaning of a "middle click" was to click on a reference somewhere in the middle of the text of the reference, say, after the authors but before the journal.

My second guess would be something having to do with a "middle finger".

Click on the middle mouse button? Okay: With an Amazon mouse, can just press down on the scroll wheel. With an HP laptop with Windows 10, I just did that: What I got was a circle with a dot in the middle, above the dot an up arrow and below the dot a down arrow. Research is still in progress in an attempt to discover how this might be used to search for, say, the Radon-Nikodym theorem! Or for something simple

y(t) = { y(0) b e^{bkt} \over y(0) \big ( e^{bkt} - 1 \big ) + b}

Maybe could do a middle click on an iPhone and then use a scanning, tunneling electron microscope to read the screen image -- give me a few minutes to set that up!!!


> Or for something simple

> y(t) = { y(0) b e^{bkt} \over y(0) \big ( e^{bkt} - 1 \big ) + b}

TeXmacs finds that, by the way. I just tested it. And each piece of it: I tested the denominator.


Clicking your scroll wheel is the middle mouse button.


Incidentally, have you tried the WYSIWYG TeXmacs word processor? (It is inspired by TeX but does not use it.)

Check out this video:

https://www.texmacs.org/tmweb/home/videos.en.html


Thanks. Have not heard of TeXmacs. Will look at the URL.

A lot of people could like WYSIWYG for anything like TeX. Thousands of times more people like and expect WYSIWYG.

The TeX I use has 100+ TeX macros I wrote.

I type the TeX input using my favorite software tool, the one I use for nearly all my typing, my favorite text editor KEDIT. It is a Windows program version of the editor XEDIT for IBM's operating system VM/CMS and written on his own time by an IBM guy in Paris. For KEDIT I have 100+ macros and write new ones frequently. E.g., KEDIT has some good string manipulation tools, and, e.g., I can parse some files, rip out the data, then pull it into, say, Excel and draw a graph or pull it into some compiled language, Fortran, C, C++, C#, etc. and do some analysis. Sometimes the analysis is statistics, but typically I derive my own statistical methods and program and use those, methods not in the usual statistical packages, SPSS, R, etc. That is, I want to work at the level of the data and some code I wrote and not try to use higher level software intended to be easier to use.

To me, I want to work at the right level for me. E.g., I don't use LaTeX -- regard it as at too high a level. I don't want their macros and don't want to learn them or fight them. Knuth's TeX documentation is very well written; I find the LaTeX documentation less well written and much longer -- bummer.

Some of my KEDIT macros are to help in typing TeX files. E.g., a neighbor gave me a cake for Valentine's Day, and I just typed a letter of thank you, in TeX, using KEDIT, and using a KEDIT macro to insert the current date in the TeX syntax I wanted. Then I have a KEDIT macro that runs the spell checker, Aspell, I got with my TeX software, and do spell checking -- also used that for the Valentine's Day letter. And I like Aspell much better than any WYSIWYG spell checking -- Aspell does better at figuring out how to correct badly misspelled words and one reason for that is that I have my own spelling dictionary additions. Also since I use just Aspell for nearly all my spell checking, I get to use my one dictionary for all my work -- that is, the dictionary should be particular to me and not separately to email, software, letter writing, etc. To address the envelope for the letter, sure, I used KEDIT for that.

Yup, lots of people like WYSIWYG and, if they were to use TeX or anything like it, would definitely want WYSIWYG.

Uh, going way back, for anything like typing, especially for TeX, writing email, writing software, I like KEDIT and don't just dislike WYSIWYG but deeply, profoundly, bitterly hate and despise WYSIWYG and nearly all it values and attempts. In the simplest terms, I can't program WYSIWYG input, that is automate the input.

Not everyone likes WYSIWYG!


I am surprised that you think that you can't program WYSIWYG input: I even suspect that the thought depends on habit of thinking, and not on consideration of the matter on its own merits.

I did not check the things that you do---described in your message above---one by one, but I expect you can do all of them using TeXmacs, as it is completely programmable using Scheme (which of course I like much more than a macro expansion language ;-) ).



