# 網誌

## Python – Fitting a Linear Model by the Normal Equation

First, we have to prepare a data file. The data are generated with the following equation:

$y = 4.0+3.0X_{1}+X_{2}+\delta$

, where $\delta\sim N(0,1)$.

It is expected to get $\theta$ very close to $(4.0,3.0,1.0)$

# data.py
import numpy as np
import numpy.random as rnd

rnd.seed(2342)
X1 = 2.0*rnd.rand(100,1)
X2 = 2.0*rnd.rand(100,1)
y = 4.0 + 3.0*X1 + X2 + rnd.randn(100,1)

with open("data.dat","w") as f:
for i in range(len(X1)):
f.write("%f\t%f\t%f\n"%(X1[i],X2[i],y[i]))


rnd.randn(100,1) generates a $100\times1$ array of random numbers with standard normal distribution.

Plot the data by gnuplot:

set xlabel "x1"
set ylabel "x2"
set zlabel "y"
splot "data.dat"


Python implementation:

# normal_equation.py
import numpy as np
import numpy.linalg as linalg

# Create X_0 term
X0 = np.array([[1.0 for _ in range(len(data))]])
X1 = np.array([[ele[0] for ele in data]])
X2 = np.array([[ele[1] for ele in data]])
X = np.concatenate((X0,X1,X2),axis=0).T
y = np.array([[ele[2] for ele in data]]).T
theta = np.matmul(linalg.inv(np.matmul(X.T,X)),np.matmul(X.T,y))
print(theta)


np.matmul(X,Y) performs the matrix multiplication of 2 matrices X and Y, and linalg.inv(X) returns the inverse of the matrix X.

stdout: [[4.08673693], [2.82390453], [1.13030646]]

We can view the best fitting plane by typing this in gnuplot:

f(x,y) = 4.08673693+2.82390453*x+1.13030646*y
replot f(x,y)


## Fitting a Linear Model by the Normal Equation

We want to find a set of $\theta_{i}$ which best fits the equation:

$y=\theta_{0}+\theta_{1}x_{1}+...+\theta_{n}x_{n}$

, given the m training examples’ input values are

$\begin{bmatrix}-(x^{(1)})^{T}-\\ -(x^{(2)})^{T}-\\ \vdots \\ -(x^{(m)})^{T}-\end{bmatrix}$

, which is a $m\times(n+1)$ matrix.

Notice that It is $n+1$ rather than $n$, since we may have a non-zero intercept term, so we add $x^{(i)}_{0}=1$ to every training example.

And the target values are

$\begin{bmatrix}y^{(1)}\\ y^{(2)}\\ \vdots \\ y^{(m)}\end{bmatrix}$

, which is a $m\times1$ vector.

We want to find a set of $\theta_{i}$ such that $J(\theta)=\frac{1}{2}(X\theta-\vec{y})^{T}(X\theta-\vec{y})$ is minimized.

Perform the matrix derivative to $J(\theta)$:

$\nabla_{\theta}J(\theta)=X^{T}X\theta-X^{T}\vec{y}$

Set the derivative to zero, hence

$\theta=(X^{T}X)^{-1}X^{T}\vec{y}$

## sqlite3 (1) – Programming in an object-oriented style

SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. sqlite3 is a Python package that provides a SQL interface.

To program in an object-oriented style:

import sqlite3

class Database:
def __init__(self,db):
"""
Connect to database or create a database if not exists.
db: database file name.
"""
self.conn = sqlite3.connect(db)
self.cursor = self.conn.cursor()

def __del__(self):
"""
Close connection to database.
"""
self.conn.close()

def dbcommit(self):
"""
Write changes to database.
"""
self.conn.commit()


When an object is instantiated, it is connected to the database. db is the database file name, such as “stock_quotes.db”. If the file does not exist, a new database file will be created.

Reference:

## Fortran – Programming Primitive Vectors of Reciprocal Lattice

Suppose $\vec{a_{1}}$, $\vec{a_{2}}$ and $\vec{a_{3}}$ are the 3 primitive vectors of a direct lattice. The primitive vectors of the reciprocal lattice are given by

$\vec{b_{1}}=2\pi\frac{\vec{a_{2}}\times \vec{a_{3}}}{\Omega_{a}}$

$\vec{b_{2}}=2\pi\frac{\vec{a_{3}}\times \vec{a_{1}}}{\Omega_{a}}$

$\vec{b_{3}}=2\pi\frac{\vec{a_{1}}\times \vec{a_{2}}}{\Omega_{a}}$

, where $\Omega_{a}$ is the volume of the primitive cell.

Fortran 90 code:

program main
implicit none
real(8), parameter :: PI=4.D0*datan(1.D0)
real(8),dimension(3) :: a1,a2,a3,b1,b2,b3
real(8),dimension(3) :: a,b
real(8) :: Va

print *,"Enter x, y, z coordinates of a1:"
print *,"Enter x, y, z coordinates of a2:"
print *,"Enter x, y, z coordinates of a3:"
Va = dot_product(a1,cross_product(a2,a3))
b1 = cross_product(a2,a3)*2*PI/Va
b2 = cross_product(a3,a1)*2*PI/Va
b3 = cross_product(a1,a2)*2*PI/Va
print *,"b1(1)=",b1(1),"b1(2)=",b1(2),"b1(3)=",b1(3)
print *,"b2(1)=",b2(1),"b2(2)=",b2(2),"b2(3)=",b2(3)
print *,"b3(1)=",b3(1),"b3(2)=",b2(2),"b3(3)=",b2(3)
print *,"Volume of Unit Cell=",Va
contains
function cross_product(x,y)
implicit none
real(8),dimension(3) :: x, y, z, cross_product
z(1)=x(2)*y(3)-x(3)*y(2)
z(2)=x(3)*y(1)-x(1)*y(3)
z(3)=x(1)*y(2)-x(2)*y(1)
cross_product=z
end function cross_product

real(8) function dot_product(x,y)
implicit none
real(8),dimension(3) :: x,y
dot_product = x(1)*y(1)+x(2)*y(2)+x(3)*y(3)
end function dot_product
end program main


The user inputs the 3 direct lattice vectors, and the program will print the 3 reciprocal lattice vectors and the volume of the direct lattice.