網誌

Python – Fitting a Linear Model by the Normal Equation

See: https://riverplus0programming.wordpress.com/2017/07/21/fitting-a-linear-model-by-the-normal-equation/

First, we have to prepare a data file. The data are generated with the following equation:

y = 4.0+3.0X_{1}+X_{2}+\delta

, where \delta\sim N(0,1).

It is expected to get \theta very close to (4.0,3.0,1.0)

# data.py
import numpy as np
import numpy.random as rnd

rnd.seed(2342)
X1 = 2.0*rnd.rand(100,1)
X2 = 2.0*rnd.rand(100,1)
y = 4.0 + 3.0*X1 + X2 + rnd.randn(100,1)

with open("data.dat","w") as f:
    for i in range(len(X1)):
        f.write("%f\t%f\t%f\n"%(X1[i],X2[i],y[i]))

rnd.randn(100,1) generates a 100\times1 array of random numbers with standard normal distribution.

Plot the data by gnuplot:

set xlabel "x1"
set ylabel "x2"
set zlabel "y"
splot "data.dat"

data

Python implementation:

# normal_equation.py
import numpy as np
import numpy.linalg as linalg

# Load data
data = np.loadtxt(fname="data.dat",dtype=np.float32)
# Create X_0 term
X0 = np.array([[1.0 for _ in range(len(data))]])
X1 = np.array([[ele[0] for ele in data]])
X2 = np.array([[ele[1] for ele in data]])
X = np.concatenate((X0,X1,X2),axis=0).T
y = np.array([[ele[2] for ele in data]]).T
theta = np.matmul(linalg.inv(np.matmul(X.T,X)),np.matmul(X.T,y))
print(theta)

np.matmul(X,Y) performs the matrix multiplication of 2 matrices X and Y, and linalg.inv(X) returns the inverse of the matrix X.

stdout: [[4.08673693], [2.82390453], [1.13030646]]

We can view the best fitting plane by typing this in gnuplot:

f(x,y) = 4.08673693+2.82390453*x+1.13030646*y
replot f(x,y)

data_fit

GitHub: https://github.com/DropletX/Data-Analysis/tree/master/Linear-Model/Python

Fitting a Linear Model by the Normal Equation

We want to find a set of \theta_{i} which best fits the equation:

y=\theta_{0}+\theta_{1}x_{1}+...+\theta_{n}x_{n}

, given the m training examples’ input values are

\begin{bmatrix}-(x^{(1)})^{T}-\\ -(x^{(2)})^{T}-\\ \vdots \\ -(x^{(m)})^{T}-\end{bmatrix}

, which is a m\times(n+1) matrix.

Notice that It is n+1 rather than n, since we may have a non-zero intercept term, so we add x^{(i)}_{0}=1 to every training example.

And the target values are

\begin{bmatrix}y^{(1)}\\ y^{(2)}\\ \vdots \\ y^{(m)}\end{bmatrix}

, which is a m\times1 vector.

We want to find a set of \theta_{i} such that J(\theta)=\frac{1}{2}(X\theta-\vec{y})^{T}(X\theta-\vec{y}) is minimized.

Perform the matrix derivative to J(\theta):

\nabla_{\theta}J(\theta)=X^{T}X\theta-X^{T}\vec{y}

Set the derivative to zero, hence

\theta=(X^{T}X)^{-1}X^{T}\vec{y}

sqlite3 (1) – Programming in an object-oriented style

SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. sqlite3 is a Python package that provides a SQL interface.

To program in an object-oriented style:

import sqlite3

class Database:
    def __init__(self,db):
        """
        Connect to database or create a database if not exists.
        db: database file name.
        """
        self.conn = sqlite3.connect(db)
        self.cursor = self.conn.cursor()

    def __del__(self):
        """
        Close connection to database.
        """
        self.conn.close()

    def dbcommit(self):
        """
        Write changes to database.
        """
        self.conn.commit()

When an object is instantiated, it is connected to the database. db is the database file name, such as “stock_quotes.db”. If the file does not exist, a new database file will be created.

GitHub: https://github.com/DropletX/sqlite3/blob/master/sqlite3_1.py

Reference:

1) sqlite3 — DB-API 2.0 interface for SQLite databases

Fortran – Programming Primitive Vectors of Reciprocal Lattice

Suppose \vec{a_{1}}, \vec{a_{2}} and \vec{a_{3}} are the 3 primitive vectors of a direct lattice. The primitive vectors of the reciprocal lattice are given by

\vec{b_{1}}=2\pi\frac{\vec{a_{2}}\times \vec{a_{3}}}{\Omega_{a}}

\vec{b_{2}}=2\pi\frac{\vec{a_{3}}\times \vec{a_{1}}}{\Omega_{a}}

\vec{b_{3}}=2\pi\frac{\vec{a_{1}}\times \vec{a_{2}}}{\Omega_{a}}

, where \Omega_{a} is the volume of the primitive cell.

Fortran 90 code:

program main
  implicit none
  real(8), parameter :: PI=4.D0*datan(1.D0)
  real(8),dimension(3) :: a1,a2,a3,b1,b2,b3
  real(8),dimension(3) :: a,b
  real(8) :: Va

  print *,"Enter x, y, z coordinates of a1:"
  read(*,*)a1(1),a1(2),a1(3)
  print *,"Enter x, y, z coordinates of a2:"
  read(*,*)a2(1),a2(2),a2(3)
  print *,"Enter x, y, z coordinates of a3:"
  read(*,*)a3(1),a3(2),a3(3)
  Va = dot_product(a1,cross_product(a2,a3))
  b1 = cross_product(a2,a3)*2*PI/Va
  b2 = cross_product(a3,a1)*2*PI/Va
  b3 = cross_product(a1,a2)*2*PI/Va
  print *,"b1(1)=",b1(1),"b1(2)=",b1(2),"b1(3)=",b1(3)
  print *,"b2(1)=",b2(1),"b2(2)=",b2(2),"b2(3)=",b2(3)
  print *,"b3(1)=",b3(1),"b3(2)=",b2(2),"b3(3)=",b2(3)
  print *,"Volume of Unit Cell=",Va
contains
  function cross_product(x,y)
    implicit none
    real(8),dimension(3) :: x, y, z, cross_product
    z(1)=x(2)*y(3)-x(3)*y(2)
    z(2)=x(3)*y(1)-x(1)*y(3)
    z(3)=x(1)*y(2)-x(2)*y(1)
    cross_product=z
  end function cross_product

  real(8) function dot_product(x,y)
    implicit none
    real(8),dimension(3) :: x,y
    dot_product = x(1)*y(1)+x(2)*y(2)+x(3)*y(3)
  end function dot_product
end program main

 

The user inputs the 3 direct lattice vectors, and the program will print the 3 reciprocal lattice vectors and the volume of the direct lattice.

GitHub: https://github.com/DropletX/Solid-State-Physics/blob/master/reciprocal_lattice.f90