Intro to Generic Programming

In this short lecture we introduce a few core programming concepts. We use both R and Python as examples, but the concepts carry over to most languages. The implementation details, i.e. how you invoke a certain concept, differ across languages.

How to install python.
Here is a nice introduction for Python novices.

Setup

Ideally you would run all commands below in both languages. I recommend that you open two terminal windows, one running R and one running Python. For Python we need the numpy package to demonstrate array support. How you install it depends on how you installed Python:

  • Anaconda installation: conda install numpy
  • Homebrew or download from python.org : pip install numpy

Check whether the installation worked by running the following in Python:

import numpy as np  
# loads the numpy library and gives it
# short name `np`

Variables

Variables are labels for objects. These can be simple numbers or strings, but also any other sort of object you can think of: a plot, a table, a matrix, a vector, a list, …

An important property of variables is their scoping behaviour: where in our programs can we see which variable? This differs considerably across languages and requires some thought.

First, let’s create a variable x which holds the value 12.3:

x = 12.3
x + 5
17.3
x <- 12.3  # = works also
x + 5
[1] 17.3

Next, we define a function that uses the variable. We do not provide x as an argument to the function, so which value will it use in each case?

def myfun(y):  
    return x + y  # must use `return`
# note the indentation!
# function definition finishes after last line of indented block.

myfun(8)
20.3
myfun <- function(y){
    x + y  # can use `return()`
}
myfun(8)
[1] 20.3

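We see that in both cases the function found the variable x even though it was not passed as an argument. When a name is not defined inside the function, both languages look for it in the environment where the function was defined, here the global environment. This only worked because x had been defined there before we called the function. Looking up names in the defining environment is called lexical scoping. This may or may not work in other languages.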
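A small sketch can make the lookup rule concrete. The function names `add_to_x` and `caller` are made up for this illustration. The function is called from a place that has its own local x, yet it still uses the x from the scope where it was defined:

```python
x = 12.3  # global variable, visible where add_to_x is defined

def add_to_x(y):
    # x is not an argument, so Python looks it up in the scope
    # where this function was defined (here: the global scope)
    return x + y

def caller():
    x = 100  # local to caller; add_to_x never sees this value
    return add_to_x(8)

result = caller()
print(result)  # uses the global x = 12.3, not the local x = 100
```

If the lookup followed the calling environment instead, the result would be 108.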

Loops

If we have a repetitive task, it’s useful to be able to iterate, i.e. do the same thing to a potentially changing input. Suppose we have the 4 numbers 2, 3, 4, 5 and want to print them to the screen. We could of course write 4 nearly identical print statements, each with a different input:

print("this is number",2)
print("this is number",3)
print("this is number",4)
print("this is number",5)
print(paste("this is number",2))
print(paste("this is number",3))
print(paste("this is number",4))
print(paste("this is number",5))

But you can see that this is a lot of repetitive code, which we want to avoid. Also, adding another number would mean extra work. So, loops are better here:

for i in range(2, 6):  # range() excludes the stop value, so this runs over 2, 3, 4, 5
    print("this is number", i)  # note the indentation!
this is number 2
this is number 3
this is number 4
this is number 5
for (i in 2:5){
    print(paste("this is number",i))
}
[1] "this is number 2"
[1] "this is number 3"
[1] "this is number 4"
[1] "this is number 5"
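Loops do not have to run over a range of integers. As a minimal sketch (the variable names `numbers` and `labels` are chosen here for illustration), Python can iterate directly over the elements of a list, so adding a number only means extending the list:

```python
numbers = [2, 3, 4, 5]

# iterate over the values directly; no index arithmetic needed
for n in numbers:
    print("this is number", n)

# enumerate() also yields the position of each value, starting at 0
labels = [f"item {pos}: {n}" for pos, n in enumerate(numbers)]
print(labels)
```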

Useful Datastructures

Concept      Python                    R
1-D list     [1, 2]                    c(1, 2)
1-D vector   np.array([1, 2])          c(1, 2)
Matrix       np.array([row1, row2])    matrix(data, rows, cols)
N-D array    np.array                  array
Dictionary   dict                      list
DataFrame    pandas.DataFrame          data.frame

1-D list/vector

li = [1,3]
li + li  # for Python lists `+` concatenates; there is no element-wise `+` or `*`
[1, 3, 1, 3]
li = c(1,3)
li * li # element-by-element
[1] 1 9
li + li
[1] 2 6

In Python we use the numpy package for element-wise arithmetic and linear algebra:

import numpy as np
li = np.array([1,3])
li * li  
array([1, 9])
li + li  
array([2, 6])
li = c(1,3)
li * li # element-by-element
[1] 1 9
li + li
[1] 2 6
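For comparison, here is a short sketch of what the operators do on plain Python lists, and how element-wise arithmetic can be written without numpy. The variable names are illustrative only:

```python
li = [1, 3]

concatenated = li + li  # `+` joins two lists end to end
repeated = li * 2       # `*` with an integer repeats the list

# element-wise operations need an explicit loop or comprehension
products = [a * b for a, b in zip(li, li)]
sums = [a + b for a, b in zip(li, li)]

print(concatenated, repeated, products, sums)
```

The comprehension versions reproduce the numpy and R results above, but numpy arrays are the more convenient choice for numerical work.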

Matrices

import numpy as np
ma = np.array([[1,3], [2,4]])
ma * ma
array([[ 1,  9],
       [ 4, 16]])
ma + ma 
array([[2, 6],
       [4, 8]])
ma = matrix(c(1,2,3,4),nrow = 2, ncol = 2)
ma * ma # element-by-element
     [,1] [,2]
[1,]    1    9
[2,]    4   16
ma + ma
     [,1] [,2]
[1,]    2    6
[2,]    4    8
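Note that `*` multiplies entry by entry in both languages. The matrix product from linear algebra uses a different operator: `@` in numpy and `%*%` in R. A minimal sketch of the difference in Python (the names `elementwise` and `matprod` are chosen for illustration):

```python
import numpy as np

ma = np.array([[1, 3], [2, 4]])

elementwise = ma * ma  # multiplies matching entries
matprod = ma @ ma      # rows-times-columns matrix product

print(elementwise)
print(matprod)
```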

N-D arrays

a = np.arange(1,9)
np.reshape(a, (2,2,2))
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
array(1:8,dim = c(2,2,2))
, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8
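The two outputs above differ because numpy fills arrays row by row by default (the last index changes fastest), while R fills them column by column (the first index changes fastest). As a sketch, passing `order="F"` to `np.reshape` requests the column-major layout that R uses:

```python
import numpy as np

a = np.arange(1, 9)

c_order = np.reshape(a, (2, 2, 2))             # row-major (numpy default)
f_order = np.reshape(a, (2, 2, 2), order="F")  # column-major, as in R

print(c_order[0])        # first block in numpy's default layout
print(f_order[:, :, 0])  # corresponds to R's `, , 1` slice
```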

Dictionaries

Dictionaries (dicts) store key -> value pairs, like a telephone book. In R, a named list plays this role:

di = {'peter' : 1225, 'alice' : 4333}
di
{'peter': 1225, 'alice': 4333}
di = list(peter = 1225, alice = 4333)
di
$peter
[1] 1225

$alice
[1] 4333
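What makes a dictionary useful is looking up values by key. The following Python sketch shows the most common operations; the entry for `bob` and the missing key `carol` are made up for illustration:

```python
di = {"peter": 1225, "alice": 4333}

peter_number = di["peter"]  # look up a value by its key
di["bob"] = 9876            # add a new (hypothetical) entry
missing = di.get("carol")   # .get() returns None instead of raising KeyError

# iterate over all key/value pairs
for name, number in di.items():
    print(name, number)
```

In R, the corresponding lookup on the named list is `di$peter` or `di[["peter"]]`.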

DataFrames

In Python, we use the pandas package for dataframe support. In R, data frames are built in. There are many ways to create a pandas dataframe.

  • Here is the official pandas documentation.
  • In R, type ?data.frame for the help entry.

import pandas as pd
d = {"one": [1.0, 2.0, 3.0, 4.0], "two": [4.0, 3.0, 2.0, 1.0]}
pd.DataFrame(d)
   one  two
0  1.0  4.0
1  2.0  3.0
2  3.0  2.0
3  4.0  1.0
data.frame(one = c(1,2,3,4.0), two = c(4,3,2,1.0))
  one two
1   1   4
2   2   3
3   3   2
4   4   1
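Once a dataframe exists, typical next steps are selecting columns and computing new ones. A minimal pandas sketch, reusing the data from above (the column name `total` is an assumption made for this example):

```python
import pandas as pd

d = {"one": [1.0, 2.0, 3.0, 4.0], "two": [4.0, 3.0, 2.0, 1.0]}
df = pd.DataFrame(d)

col_one = df["one"]                  # a single column (a pandas Series)
df["total"] = df["one"] + df["two"]  # element-wise sum stored as a new column
first_row = df.iloc[0]               # first row, selected by position

print(df)
print(df["one"].mean())  # average of column "one"
```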

© Florian Oswald, 2024