## Presentation given on 12/11/2019: version control with git, markup languages

Document Version Control with GIT

Before we start…


What is version control?

  • Version control is a system that records changes to a file or set of files over time so that you can recall specific versions later.

  • Has existed for almost as long as writing itself (e.g. numbered document versions)

  • Today, the most capable (as well as complex) revision control systems are those used in software development.


  • Revert files back to a previous state

  • "Freeze" important versions of a document

  • Compare changes over time

  • Track progress of a project

  • See who modified something, and when

Modern version control systems

  • Remote backup of files

  • Powerful tool for collaboration


  • Developed by Linus Torvalds in 2005

  • The Linux kernel:

    • ~63000 files

    • Roughly 15,600 developers from more than 1,400 companies


  • Free and open source

  • Distributed

  • Powerful and flexible

  • Learning curve can be steep


[Image: xkcd comic about git]

How does it work?



Package managers are heavily recommended!

Creating a remote repository

  • register at the remote git server

  • create repository

  • add the participants' SSH public keys

  • clone the repository to your machine
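
Registration and SSH keys depend on the server, so they cannot be shown generically. The last two steps can be sketched locally, though, assuming a bare repository on disk as a stand-in for the server (the `remote.git` and `local` names are hypothetical):

```shell
set -eu
work="$(mktemp -d)"                      # scratch directory for the demo

# Stand-in for "create repository" on the server: a bare repository
# (history only, no working files), here on the local disk.
git init --quiet --bare "$work/remote.git"

# Clone it to "your machine". With a real server the argument would be
# the SSH or HTTPS address of the repository instead of a local path.
git clone --quiet "$work/remote.git" "$work/local" 2>/dev/null

# The clone remembers where it came from under the name "origin".
git -C "$work/local" remote get-url origin
```

The empty-repository warning from `git clone` is silenced here only to keep the output short.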

README and .gitignore

Every repository should have these 2 files:

  • README: project description and useful information

  • .gitignore: special file telling git which files should not be tracked
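
A minimal sketch of both files in a fresh repository. The file names and ignore patterns are illustrative assumptions, not a recommended set:

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# README: project description and useful information.
printf '# My project\n\nSlides and notes.\n' > README.md

# .gitignore: one pattern per line; matching files are not tracked.
printf '*.log\nbuild/\n' > .gitignore

# One file that matches a pattern and one that does not.
touch debug.log notes.txt

# List the files git would consider adding; debug.log is not among them.
git status --porcelain --untracked-files=all
```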


git workflow

copying remote repository: clone

  • git clone repository

  • Copies the remote repository, including its full history, to your machine

staging changes (local)

  • git add files

  • Adds the changes into the local staging area

Saving changes: commit (local)

  • git commit -m "message"

  • Saves the changes in the staging area into the repository

  • Creates a "snapshot" of the current state of one or more files

  • A message describing the changes must be provided
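
The stage-then-commit sequence can be tried in a throwaway repository. This is a sketch; the identity and file name are hypothetical, and the identity is set only so the commit works on a machine where git is not yet configured:

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name  "Demo Author"        # hypothetical identity
git config user.email "demo@example.com"

echo "First draft" > report.txt

git add report.txt                                   # stage the change
git commit --quiet -m "Add first draft of report"    # save a snapshot

git log --oneline                                    # one line per commit
```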

history and revert (local)

  • git log files

  • returns a history of the file modifications

  • git revert commit

  • undoes the changes introduced by the given commit by recording a new commit; the original commit stays in the history
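
A small sketch of both commands on a two-commit history (file contents and identity are made up for the demo). Note that `git revert` records the undo as a new commit, so the history grows rather than shrinks:

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name  "Demo Author"        # hypothetical identity
git config user.email "demo@example.com"

echo "Good paragraph" > report.txt
git add report.txt
git commit --quiet -m "Add good paragraph"

echo "Bad paragraph" >> report.txt
git add report.txt
git commit --quiet -m "Add bad paragraph"

git log --oneline report.txt       # history of this one file

# Undo the latest commit; --no-edit accepts the default revert message.
git revert --no-edit HEAD

git log --oneline                  # the revert appears as a third commit
cat report.txt                     # the bad paragraph is gone
```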

upload to remote repository: push

  • git push

  • Uploads the local commits to the remote repository

Download from remote repository: pull

  • git pull

  • Fetches the changes from the remote repository and merges them into the local one

  • Merging can generate conflicts; git will ask us to resolve them and commit the result
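
The push and pull steps are easiest to see with two collaborators. The sketch below assumes a local bare repository as the remote and two hypothetical users, Alice and Bob; it covers only the conflict-free case:

```shell
set -eu
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"          # stand-in remote

# Alice clones, commits and pushes.
git clone --quiet "$work/remote.git" "$work/alice" 2>/dev/null
cd "$work/alice"
git config user.name "Alice"; git config user.email "alice@example.com"
echo "Chapter 1" > book.txt
git add book.txt
git commit --quiet -m "Add chapter 1"
git push --quiet -u origin HEAD    # -u links the local branch to the remote one

# Bob clones, adds a chapter and pushes it.
git clone --quiet "$work/remote.git" "$work/bob"
cd "$work/bob"
git config user.name "Bob"; git config user.email "bob@example.com"
echo "Chapter 2" >> book.txt
git add book.txt
git commit --quiet -m "Add chapter 2"
git push --quiet

# Alice pulls: Bob's commit is fetched and merged into her copy.
cd "$work/alice"
git pull --quiet
cat book.txt
```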


version control flow

other (advanced) stuff

  • tags

  • partial reverts

  • change history

Docs as code

  • Software is a small part of the documents a project must handle

  • Still, version control and remote collaboration are needed for all the documents

  • In recent years there has been a strong push to treat documents the same way as source code


  • Working in plain text files (rather than binary file formats like Word)

  • Collaborating using version control such as git and GitHub

  • Storing docs in the same repositories as the programming code itself

  • Versioning docs through git tags/releases (rather than duplicating all the files to archive each release)

  • Generate other formats or websites without modifying the document
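
As an illustration of versioning docs through tags, here is a sketch that tags one state of a manual and later reads it back without keeping a duplicate copy. The `v1.0` tag, file name and identity are hypothetical:

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name  "Demo Author"        # hypothetical identity
git config user.email "demo@example.com"

echo "Manual for release 1" > manual.md
git add manual.md
git commit --quiet -m "Write manual for release 1"
git tag -a v1.0 -m "Documentation for release 1.0"   # name this state

echo "Manual for release 2" > manual.md
git add manual.md
git commit --quiet -m "Update manual for release 2"

git show v1.0:manual.md      # the manual exactly as it was at v1.0
```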

Just a little problem…

  • The most common document formats (Word, PDF…) are binary files

  • git is text based: it can store binary files, but it cannot show meaningful differences between their versions or merge them
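
The difference shows up in `git diff`. In this sketch a file containing a NUL byte stands in for a Word or PDF document (an assumption made to keep the demo self-contained); for the text file git prints the added line, for the binary one it only reports that the files differ:

```shell
set -eu
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name  "Demo Author"        # hypothetical identity
git config user.email "demo@example.com"

printf 'First line\n' > report.md
printf 'First\000line\n' > report.bin      # the NUL byte makes git treat it as binary
git add report.md report.bin
git commit --quiet -m "Add both versions of the report"

# Change both files in the same way.
printf 'Second line\n' >> report.md
printf 'Second\000line\n' >> report.bin

git diff
```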


  • Markup languages:

  • Markup languages are ways of annotating an electronic document.

  • Usually markup will either specify how something should be displayed or what something means.

    • html, xml, latex…

Markup languages

  • Documents are written in plain text, then a program converts them into the final document

  • The same document can be used to generate files in other formats: latex, word, pdf or even slides

  • Formatting is done by the computer, so the output is always consistent

  • Fast and light

  • Can be used in version control systems

Markup languages: Advanced features

  • Automatic generation of documents

  • Inline comments (not rendered in the final document)

  • Split one document into several files. Ex: main document, chapters and bibliography

  • Code executed and plots rendered in the document


  • Extensively used for technical papers

  • Beautiful generated documents

  • Very powerful…

  • and very heavy

  • Setup and document customization are complex

Latex: example



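\documentclass[12pt]{article}

\title{Introduction to \LaTeX{}}
\author{Author's Name}

\begin{document}
\maketitle

\begin{abstract}
The abstract text goes here.
\end{abstract}

\section{Introduction}
Here is the text of your introduction.
\begin{equation}
    \alpha = \sqrt{ \beta }
\end{equation}

\subsection{Subsection Heading Here}
Write your subsection text here.
\begin{figure}
    \centering
    % figure content (e.g. an \includegraphics call) goes here
    \caption{Simulation Results}
\end{figure}

\section{Conclusion}
Write your conclusion here.
\end{document}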


Latex: example II


\section*{Notes for My Paper}

Don't forget to include examples of topicalization.
They look like this:

\enumsentence{Topicalization from sentential subject:\\
\shortex{7}{a John$_i$ [a & kltukl & [el &
  {\bf l-}oltoir & er & ngii$_i$ & a Mary]]}
{ & {\bf R-}clear & {\sc comp} &
  {\bf IR}.{\sc 3s}-love   & P & him & }
{John, (it's) clear that Mary loves (him).}}

\subsection*{How to handle topicalization}

I'll just assume a tree structure like (\ex{1}).

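\enumsentence{Structure of A$'$ Projections:\\ [2ex]
\begin{tabular}[t]{cccc}
    & \node{i}{CP}\\ [2ex]
    \node{ii}{Spec} &   &\node{iii}{C$'$}\\ [2ex]
        &\node{iv}{C} & & \node{v}{SAgrP}
\end{tabular}
\nodeconnect{i}{ii}
\nodeconnect{i}{iii}
\nodeconnect{iii}{iv}
\nodeconnect{iii}{v}
}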


Mood changes when there is a topic, as well as when
there is WH-movement.  \emph{Irrealis} is the mood when
there is a non-subject topic or WH-phrase in Comp.
\emph{Realis} is the mood when there is a subject topic
or WH-phrase.


Latex alternative: Lyx

  • Graphical (WYSIWYM, "what you see is what you mean") editor built on latex

  • Documents are saved as .lyx files, Lyx's own plain-text format, and can be exported to latex

  • Can be used together with version control

  • Provides, by default, templates for many of the biggest scientific journals

Lyx: example


Lyx: example II


Lightweight Markup languages

  • Also called Plain Text Markup or humane markup language

  • Provide a way of formatting the document while keeping the source readable

  • Widely used on websites and code documentation

LML: current options

  • Markdown

  • reStructuredText (rst)

  • Asciidoc


  • Created for minimal formatting of web text

  • Used everywhere: web, jupyter notebooks, r-markdown…

  • There is no single standard; many flavors currently exist (GitHub, CommonMark, pandoc)

  • Originally not intended for documents, very limited

  • Different flavors and tools try to overcome this limitation

    • (+ pandoc)

Markdown: example

[Image: Markdown quick tour example]
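
For readers without the slide image, here is a small sample of common Markdown syntax, written to a file from the shell. The content and the `notes.md` name are made up for illustration; the pandoc step runs only if pandoc is installed:

```shell
set -eu
dir="$(mktemp -d)"
cd "$dir"

# The quoted 'EOF' keeps the shell from interpreting backticks and asterisks.
cat > notes.md <<'EOF'
# Meeting notes

Some *emphasised* and **strong** text, with `inline code`.

## Actions

- Install git
- Read the [Pro Git book](https://git-scm.com/book)

1. First step
2. Second step
EOF

# Optional: convert the same source to HTML.
if command -v pandoc >/dev/null 2>&1; then
    pandoc notes.md -o notes.html
fi
```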


  • Developed for book creation.

  • Limited number of users

  • Standardized and extensible, great documentation

  • Lack of resources means that bugs and feature requests take time to be addressed


  • Originally intended for Python documentation

  • Medium-sized but very tech-savvy community

  • Syntax is a little different from the other two

  • Very powerful and extensible
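
To give an idea of how the syntax differs, here is a short reStructuredText sample, again written from the shell. The content and file name are illustrative: headings are underlined rather than prefixed, inline code uses double backticks, and links have their own form.

```shell
set -eu
dir="$(mktemp -d)"
cd "$dir"

cat > notes.rst <<'EOF'
Meeting notes
=============

Some *emphasised* and **strong** text, with ``inline code``.

Actions
-------

- Install git
- Read the `Pro Git book <https://git-scm.com/book>`_

#. First step
#. Second step
EOF

# Optional: convert to HTML if pandoc is installed.
if command -v pandoc >/dev/null 2>&1; then
    pandoc notes.rst -o notes.html
fi
```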

Which one to use?

  • Notetaking:

    • Markdown

    • Asciidoc

    • reStructuredText

  • Anything more serious:

    • reStructuredText

    • Latex/Lyx


choco install git vscode pandoc