import numpy as np
# loads the numpy library and gives it
# short name `np`
Intro to Generic Programming
In this short lecture we introduce a few core concepts used in programming. We use both R and python as examples, but the concepts carry over to most languages. The implementation details, i.e. how you invoke a certain concept, will differ across languages.
How to install python.
Here is a nice introduction for Python novices.
Setup
Ideally you would run all commands below in both languages. I recommend that you open two terminal windows, one running R and one running python. For python we need the numpy package to demonstrate array support. Depending on how you installed python, there are different ways to install it.
- Anaconda installation: conda install numpy
- Homebrew or download from python.org: pip install numpy
Check whether the installation worked by running the statement `import numpy as np` in python. It should complete without an error.
Variables
Variables are labels for objects. These can be simple numbers or strings, but often also any other sort of object you could think of: a plot, a table, a matrix, a vector, a list, …
What is curious to know about variables is their scoping behaviour: where in our programs can we see which variable? This differs quite importantly across languages and is something that requires some thought.
First, let’s create a variable x which holds the value 12.3:
x = 12.3
x + 5
17.3
x <- 12.3 # = works also
x + 5
[1] 17.3
Next, a function which will use the variables - here we do not provide x as an argument to the function, so which value will it use in each case?
def myfun(y):
    return x + y # must use `return`
    # note the indentation!
# function definition finishes after last line of indented block.
myfun(8)
20.3
myfun <- function(y){
  x + y # can use `return()`
}
myfun(8)
[1] 20.3
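We see that in both cases the function did not find x among its own arguments or local variables, so it looked for x in the enclosing scope, i.e. the environment where the function was defined (here the global environment). This only worked because we had defined x before calling the function. This may or may not work in other languages. In general this is called lexical scoping: a variable is looked up where the function was defined, not where it was called from.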
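To see which scope matters, one can call the function from inside another function that has its own x. The following is a minimal sketch in python; `caller` and its local x are hypothetical names introduced only for this illustration.

```python
x = 12.3  # global variable, as in the example above


def myfun(y):
    # x is not an argument or a local variable,
    # so python looks it up where myfun was defined (the global scope)
    return x + y


def caller():
    x = 100.0  # local to caller, not visible inside myfun
    return myfun(8)


result = caller()
print(result)  # uses the global x, not the one local to caller
```

Under lexical scoping, `myfun` ignores the x that is local to `caller`. The same experiment in R should behave the same way, since R also looks up free variables in the environment where the function was defined.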
Loops
If we have a repetitive task, it’s useful to be able to iterate, i.e. do the same thing to a potentially changing input. Consider that we had 4 numbers 2,3,4,5 and we wanted to print them to screen. We could of course write 4 print statements, each with a different input:
print("this is number",2)
print("this is number",3)
print("this is number",4)
print("this is number",5)
print(paste("this is number",2))
print(paste("this is number",3))
print(paste("this is number",4))
print(paste("this is number",5))
But you can see that this is a lot of repetitive code, which we want to avoid. Also, adding an additional number would mean a lot of extra work. So, loops are better here:
for i in range(2,6): # range excludes its end point, so this runs over 2,3,4,5
    print("this is number",i) # note the indentation!
this is number 2
this is number 3
this is number 4
this is number 5
for (i in 2:5){
  print(paste("this is number",i))
}
[1] "this is number 2"
[1] "this is number 3"
[1] "this is number 4"
[1] "this is number 5"
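Two details of python loops are easy to trip over: `range(a, b)` stops before b, and a loop can run directly over a collection instead of a range of integers. A small sketch, where `numbers` and `messages` are names made up for this example:

```python
# range(a, b) includes a but stops before b
upto_five = list(range(2, 6))
print(upto_five)

# a loop can iterate directly over the elements of a list
numbers = [2, 3, 4, 5]
messages = []
for n in numbers:
    messages.append(f"this is number {n}")  # f-string inserts n into the text

for m in messages:
    print(m)
```

Adding another number then only requires changing the list, not the body of the loop.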
Useful Datastructures
- python docs on data structures
- Article about R datastructures
| concept | Python | R |
|---|---|---|
| 1d list | [1,2] | c(1,2) |
| 1d vector | np.array([1,2]) | c(1,2) |
| matrix | np.array([[1,3],[2,4]]) | matrix(data,rows,cols) |
| n-d array | np.array | array |
| Dictionary | dict | list |
| DataFrame | pandas.DataFrame | data.frame |
1-D list/vector
li = [1,3]
li + li # `+` concatenates python lists; there is no element-wise `+` or `*`
[1, 3, 1, 3]
li = c(1,3)
li * li # element-by-element
[1] 1 9
li + li
[1] 2 6
In python we use the numpy package for linear algebra:
import numpy as np
li = np.array([1,3])
li * li
array([1, 9])
li + li
array([2, 6])
li = c(1,3)
li * li # element-by-element
[1] 1 9
li + li
[1] 2 6
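The difference between a plain python list and a numpy array also shows up when multiplying by a number. The sketch below contrasts the two; the variable names are chosen for this illustration only.

```python
import numpy as np

li = [1, 3]          # plain python list
arr = np.array(li)   # numpy array built from the list

list_times_two = li * 2     # repeats the list
array_times_two = arr * 2   # multiplies each element by 2
dot = arr @ arr             # inner product: 1*1 + 3*3

print(list_times_two)
print(array_times_two)
print(dot)
```

A list is a general container, so `*` means repetition. A numpy array behaves like a vector, which is closer to what `c(1,3)` gives in R.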
Matrices
import numpy as np
ma = np.array([[1,3], [2,4]])
ma * ma
array([[ 1,  9],
       [ 4, 16]])
ma + ma
array([[2, 6],
       [4, 8]])
ma = matrix(c(1,2,3,4),nrow = 2, ncol = 2)
ma * ma # element-by-element
     [,1] [,2]
[1,]    1    9
[2,]    4   16
ma + ma
     [,1] [,2]
[1,]    2    6
[2,]    4    8
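Note that `*` multiplies entry by entry in both languages. For the matrix product from linear algebra, numpy uses the `@` operator (in R the corresponding operator is `%*%`). A short sketch comparing the two operations on the same matrix:

```python
import numpy as np

ma = np.array([[1, 3], [2, 4]])

elementwise = ma * ma   # multiplies matching entries
matprod = ma @ ma       # matrix product: rows times columns

print(elementwise)
print(matprod)
```

The two results differ, so it is worth checking which operation a formula actually calls for.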
N-D arrays
a = np.arange(1,9)
np.reshape(a, (2,2,2))
array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])
array(1:8,dim = c(2,2,2))
, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8
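The two outputs above arrange the same eight numbers differently: numpy fills an array with the last index changing fastest (row-major), while R fills with the first index changing fastest (column-major). As a sketch, the `order="F"` argument of `np.reshape` should reproduce the R layout; the names `c_order` and `f_order` are made up for this example.

```python
import numpy as np

a = np.arange(1, 9)  # the numbers 1 to 8

c_order = np.reshape(a, (2, 2, 2))             # numpy default: last index fastest
f_order = np.reshape(a, (2, 2, 2), order="F")  # first index fastest, as in R

print(c_order[0])        # first block in the numpy layout
print(f_order[:, :, 0])  # corresponds to the slice ", , 1" printed by R
print(f_order[:, :, 1])  # corresponds to the slice ", , 2" printed by R
```

This matters whenever data is passed between the two languages, because the same flat vector can end up in different positions of the array.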
Dictionaries
Dicts are collections with a key -> value structure, like a telephone book:
di = {'peter' : 1225, 'alice' : 4333}
di
{'peter': 1225, 'alice': 4333}
di = list(peter = 1225, alice = 4333)
di
$peter
[1] 1225

$alice
[1] 4333
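Looking up a number in the telephone book means accessing a value by its key. A minimal sketch of the most common dict operations in python; the entries `bob` and `carol` are hypothetical and not part of the example above.

```python
di = {'peter': 1225, 'alice': 4333}

peter_number = di['peter']       # look up a value by its key
di['bob'] = 9876                 # add a new key -> value pair (hypothetical entry)
has_carol = 'carol' in di        # test whether a key exists
carol_number = di.get('carol')   # .get returns None instead of raising an error

# iterate over all key -> value pairs
for name, number in di.items():
    print(name, number)
```

In R, the corresponding lookup on the list above would be `di$peter` or `di[["peter"]]`.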
DataFrames
In python, we use the pandas package for dataframe support. In R they are built-in as we know. There are many ways to create a pandas dataframe.
- Here is the official pandas documentation.
- In R, type ?data.frame for the help entry.
import pandas as pd
d = {"one": [1.0, 2.0, 3.0, 4.0], "two": [4.0, 3.0, 2.0, 1.0]}
pd.DataFrame(d)
   one  two
0  1.0  4.0
1  2.0  3.0
2  3.0  2.0
3  4.0  1.0
data.frame(one = c(1,2,3,4.0), two = c(4,3,2,1.0))
  one two
1   1   4
2   2   3
3   3   2
4   4   1
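Once a dataframe exists, typical next steps are selecting a column, computing a new one, and filtering rows. The following sketch assumes pandas is installed; the names `df`, `subset` and the column `sum` are introduced here for illustration.

```python
import pandas as pd

d = {"one": [1.0, 2.0, 3.0, 4.0], "two": [4.0, 3.0, 2.0, 1.0]}
df = pd.DataFrame(d)

col_one = df["one"]                  # select a single column
df["sum"] = df["one"] + df["two"]    # new column, computed element by element
subset = df[df["one"] > 2.0]         # keep only rows where column `one` exceeds 2

print(df)
print(subset)
```

Note that pandas labels rows starting at 0, while R's data.frame labels them starting at 1, as the two printouts above show.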
© Florian Oswald, 2024