Given an integer array coins representing coins of different denominations and an integer amount representing a total amount of money.
Return the fewest number of coins that you need to make up that amount. If that amount of money cannot be made up by any combination of coins, return $-1$.
$dp_i$ represents the minimum number of coins that make up the amount $i$. Assuming the values from $dp_0$ to $dp_{i-1}$ are already known, the transition equation for $dp_i$ can be deduced as $$ dp_i = \min_{j=0}^{l_{coins}-1} dp_{i-coins_j}+1 $$ which means the value of $dp_i$ is determined by $dp_{i-coins_j}$: if $dp_i$ can be reached from $dp_{i-coins_j}$ by using coin $coins_j$ (which requires $coins_j \le i$), the number of coins used increases by $1$.
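A minimal C++ sketch of this bottom-up transition (the INT_MAX sentinel marks unreachable amounts, and the function name coinChange simply follows the usual LeetCode signature):

```cpp
#include <vector>
#include <algorithm>
#include <climits>
using namespace std;

// dp[i] holds the minimum number of coins that make up amount i,
// or INT_MAX if amount i cannot be formed at all.
int coinChange(const vector<int>& coins, int amount) {
    vector<int> dp(amount + 1, INT_MAX);
    dp[0] = 0;                                       // zero coins make up amount 0
    for (int i = 1; i <= amount; ++i) {
        for (int c : coins) {
            if (c <= i && dp[i - c] != INT_MAX) {
                dp[i] = min(dp[i], dp[i - c] + 1);   // transfer from dp[i - c] via coin c
            }
        }
    }
    return dp[amount] == INT_MAX ? -1 : dp[amount];
}
```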
The SHA-256 algorithm is a member of the SHA-2 family, a.k.a. Secure Hash Algorithm 2. As the name suggests, SHA is a family of hash functions rather than encryption: the algorithm compresses an input message of arbitrary length into a fixed-length digest by hashing. The result of SHA-256 is usually displayed as a string of seemingly random hexadecimal characters.
Explanation of the SHA-256 Algorithm
Before explaining the SHA-256 algorithm in detail, it is essential to cover a few prerequisites. Three topics stand out: the constants used in the algorithm, how the data is preprocessed, and the logical operations involved. Once these are understood, the algorithm itself is easy to handle.
Before Diving into SHA-256
Constants
The SHA-256 algorithm uses 8 initial hash values and 64 hash constants. First, the eight initial values are as follows:
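For reference, these are the values defined in the SHA-256 specification (FIPS 180-4): the fractional parts of the square roots of the first eight primes, written here as a C++ array:

```cpp
#include <cstdint>

// The eight initial hash values H0..H7 of SHA-256 (FIPS 180-4).
const uint32_t H_INIT[8] = {
    0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
    0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19
};
```

The 64 hash constants $K_0,\dots,K_{63}$ are defined analogously as the fractional parts of the cube roots of the first 64 primes; they are omitted here for brevity.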
Finally, the length of the original message should be appended to the end of the current data as a 64-bit big-endian integer.
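A minimal C++ sketch of the whole padding step, assuming the standard SHA-256 rule (append a single 1 bit as a 0x80 byte, then zero bytes until the length is 56 mod 64 bytes, then the original bit length in big-endian order); the function name sha256_pad is my own:

```cpp
#include <cstdint>
#include <vector>

// Pad a message so its length becomes a multiple of 64 bytes (512 bits).
std::vector<uint8_t> sha256_pad(const std::vector<uint8_t>& msg) {
    std::vector<uint8_t> data = msg;
    uint64_t bit_len = static_cast<uint64_t>(msg.size()) * 8;

    data.push_back(0x80);                 // a single '1' bit followed by seven '0' bits
    while (data.size() % 64 != 56)        // fill with zeros up to 56 mod 64 bytes
        data.push_back(0x00);

    for (int i = 7; i >= 0; --i)          // 64-bit big-endian original length
        data.push_back(static_cast<uint8_t>(bit_len >> (i * 8)));
    return data;
}
```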
Logical Operations
All of the logical operations involved in the SHA-256 algorithm are as follows: $$ Ch(x,y,z)=(x \land y) \oplus (\neg x \land z) $$ $$ Maj(x,y,z)=(x\land y)\oplus (x \land z) \oplus (y \land z) $$ $$ \Sigma_0(x) = S_R^2(x) \oplus S_R^{13}(x) \oplus S_R^{22}(x) $$ $$ \Sigma_1(x) = S_R^6(x) \oplus S_R^{11}(x) \oplus S_R^{25}(x) $$ $$ \sigma_0(x) = S_R^7(x) \oplus S_R^{18}(x) \oplus R^3(x) $$ $$ \sigma_1(x) = S_R^{17}(x) \oplus S_R^{19}(x) \oplus R^{10}(x) $$
Explanations:
$\land$: bitwise AND.
$\neg x$: bitwise NOT.
$\oplus$: bitwise XOR.
$S_R^n$: circular right shift (rotate right) by $n$ bits.
$R^n$: logical right shift by $n$ bits.
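Translated into C++, these operations on 32-bit words could look like the following sketch (the helper names rotr and shr are my own):

```cpp
#include <cstdint>

// Circular right shift (S_R^n) and logical right shift (R^n) on 32-bit words.
inline uint32_t rotr(uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }
inline uint32_t shr (uint32_t x, int n) { return x >> n; }

inline uint32_t Ch (uint32_t x, uint32_t y, uint32_t z) { return (x & y) ^ (~x & z); }
inline uint32_t Maj(uint32_t x, uint32_t y, uint32_t z) { return (x & y) ^ (x & z) ^ (y & z); }

inline uint32_t Sigma0(uint32_t x) { return rotr(x, 2)  ^ rotr(x, 13) ^ rotr(x, 22); }
inline uint32_t Sigma1(uint32_t x) { return rotr(x, 6)  ^ rotr(x, 11) ^ rotr(x, 25); }
inline uint32_t sigma0(uint32_t x) { return rotr(x, 7)  ^ rotr(x, 18) ^ shr(x, 3); }
inline uint32_t sigma1(uint32_t x) { return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10); }
```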
Abstraction
Now we come to the main part of the SHA-256 algorithm! All the details of the algorithm will be elaborated here.
The first step is to break the message into 512-bit chunks.
Suppose that the message $m$ can be split into $n$ chunks; the whole algorithm then needs to iterate $n$ times in order to get the 256-bit hash result. Initially, a 256-bit initial value $H_0$ is given; it is combined with data chunk 1 to get $H_1$, then combined with data chunk 2 to get $H_2$. Repeat this step until all data chunks have been processed.
The pseudo-code is:
```
data_chunks = divide_by_512_bit(message)
handle = H0
for (chunk in data_chunks):
    handle = operate(handle, chunk)
result = handle
```
Within each operation, the 512-bit data chunk is divided into 16 words; each word contains 32 bits and is encoded in big-endian order, written as $W_0,\dots,W_{15}$. This means the first sixteen words can be derived directly from the $i^{th}$ chunk of data. The remaining words are derived through the following equation (all additions are modulo $2^{32}$): $$ W_t = \sigma_1(W_{t-2}) + W_{t-7} + \sigma_0(W_{t-15}) + W_{t-16}, \quad 16 \le t \le 63 $$
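A rough C++ sketch of this message-schedule expansion, reusing the sigma0/sigma1 helpers sketched earlier (unsigned 32-bit arithmetic wraps modulo $2^{32}$ automatically):

```cpp
#include <cstdint>

// Expand one 512-bit chunk (given as 64 bytes) into the 64-word message schedule.
// Assumes the sigma0/sigma1 helpers from the sketch above.
void expand_schedule(const uint8_t chunk[64], uint32_t W[64]) {
    for (int t = 0; t < 16; ++t)          // W0..W15: big-endian 32-bit words of the chunk
        W[t] = (uint32_t(chunk[4*t]) << 24) | (uint32_t(chunk[4*t+1]) << 16) |
               (uint32_t(chunk[4*t+2]) << 8) |  uint32_t(chunk[4*t+3]);
    for (int t = 16; t < 64; ++t)         // W16..W63: derived from earlier words (mod 2^32)
        W[t] = sigma1(W[t-2]) + W[t-7] + sigma0(W[t-15]) + W[t-16];
}
```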
Then, iterate over the eight working words for 64 rounds following these rules:
$$ A=H+\Sigma_1(E)+Ch(E,F,G)+K_i+W_i+Maj(A,B,C)+\Sigma_0(A) $$ $$ B=A $$ $$ C=B $$ $$ D=C $$ $$ E=D+H+\Sigma_1(E)+Ch(E,F,G)+W_i+K_i $$ $$ F=E $$ $$ G=F $$ $$ H=G $$ Notice that all eight assignments are simultaneous (the right-hand sides use the values from the previous round) and all additions are modulo $2^{32}$. The initial values of the words $A$–$H$ are $H_0$–$H_7$, and $K_i$ is the $i^{th}$ of the 64 hash constants.
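A hedged C++ sketch of one compression call (the operate step in the pseudo-code above), assuming the helpers and message schedule from the previous sketches; K here stands for the 64 round constants. Note that after the 64 rounds the working words are added back into the running hash value, which is what produces the next $H$ for the following chunk:

```cpp
#include <cstdint>

// Compress one 64-word schedule W into the running hash value H[8].
// K[64] holds the 64 round constants; all additions wrap modulo 2^32.
void compress(uint32_t H[8], const uint32_t W[64], const uint32_t K[64]) {
    uint32_t a = H[0], b = H[1], c = H[2], d = H[3];
    uint32_t e = H[4], f = H[5], g = H[6], h = H[7];
    for (int i = 0; i < 64; ++i) {
        uint32_t T1 = h + Sigma1(e) + Ch(e, f, g) + K[i] + W[i];
        uint32_t T2 = Sigma0(a) + Maj(a, b, c);
        h = g; g = f; f = e; e = d + T1;   // the simultaneous updates from the equations above
        d = c; c = b; b = a; a = T1 + T2;
    }
    H[0] += a; H[1] += b; H[2] += c; H[3] += d;   // fold the working words back into the hash
    H[4] += e; H[5] += f; H[6] += g; H[7] += h;
}
```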
These notes are based on MIT's course missing-semester: Editors (Vim).
Vim
Vim is a command-line based text editor, designed by programmers for programmers. Other visual editors like VS Code can also integrate a Vim mode for an advanced editing experience.
The core philosophy of Vim is the command: all operations are done by typing keys, without using the mouse. Once you are familiar with these key bindings, your editing speed can improve dramatically.
Modes of vim
Vim has several operating modes for different purposes:
Normal: navigating, moving, and simple editing. Press ESC to go back to normal mode.
Insert: inserting text. Press i to switch to insert mode.
Replace: replacing text. Press shift-r to switch to replace mode.
Visual:
Visual-plain: selecting text by cursor movement. Press v.
Visual-line: selecting text by line. Press shift-v.
Visual-block: selecting text by block. Press ctrl-v.
Command-line: running commands. Press : (shift-;).
Command-line
Press : to enter command-line mode, where you can send commands that operate on Vim as a whole.
:q | quit (or close current window).
:w | write and save.
:wq | save and quit.
:e {filename} | open file and edit.
:ls | show open buffers.
:help {topic} | open help
:help :w | help info for command :w.
:help w | help info for command w.
Cursor movement
Basic movement: hjkl | left, down, up, right.
Move by word: w, b, e | next word, beginning of word, end of word.
Move by line: 0, ^, $ | beginning of line, first non-blank character, end of line.
Move by screen: H, M, L | top of screen, middle of screen, bottom of screen.
Scroll: ctrl-u, ctrl-d | scroll up, scroll down.
Move by file: gg, G | beginning of file, end of file.
Misc: % | corresponding item.
Find: f{char}, t{char}, F{char}, T{char} | find / move to a character forward / backward on the current line.
Search: /{regex}, n, N | search for a regex; n / N jump to the next / previous match.
Edits
i | enter insert mode.
o, O | insert line below, insert line above.
d{motion} | delete by motion command.
dw | delete word.
d$ | delete to end of line.
d0 | delete to beginning of line.
c{motion} | change by motion command.
x | delete character.
s | substitute character.
u | undo.
ctrl-r | redo.
y | copy (yank).
p | paste.
Counts
Using the command pattern {number}{command}, a command can be repeated that many times:
3w move 3 words forward.
5j move 5 lines down.
7dw delete 7 words.
Modifiers
When using a delete or change command, adding i or a specifies whether the command operates inside or around the surrounding text object.
ci( | change content inside current pair of parentheses.
ci[ | change content inside current pair of square brackets.
da' | delete a single-quoted string, including the surrounding single quotes.
Attention Is All You Need is a paper published in 2017 by the Google Brain team. In this paper, the prominent self-attention mechanism and multi-headed self-attention mechanism are proposed to enhance the general capacity of neural networks. Meanwhile, a basic neural network architecture, the Transformer, is proposed to adopt these mechanisms, and it is tested on translation tasks.
What is Attention?
The attention mentioned in neural networks is similar to the one we speak of in human intelligence: an ability to capture information that seems (subjectively) more important. The "attention mechanism" in human cognition is prominently manifested across various sensory perceptions. For instance, when using our eyes, certain features such as bright colors, rapid movements, or familiar faces tend to capture our gaze. Similarly, during reading, specific names, events, or elegant phrases can leave a lasting impression. This attention mechanism also extends to other senses like smell, touch, and hearing. Furthermore, the stream of consciousness during thought processes is often influenced by particularly striking information, all of which illustrates the attention mechanism in human cognition.
In the context of a neural network, what does the attention mechanism refer to? For example, when we want a neural network to process an image, how does it determine which elements of the image are more important? Consider objects like people, pets, or cars in an image: how does the network decide which one is more noteworthy in the image's "context"? The attention mechanism addresses this problem. When processing an image, the neural network assigns higher weights, or "attention scores", to what it deems more important information. During the subsequent forward process, the high-weight information is emphasized, enabling the network to capture more detailed and deeper features. This, in turn, aids in handling the semantic importance of the image. Besides image processing, the mechanism is applicable to text, audio, graphs, and other data types. Since data in neural networks exists as discrete tensors, the fundamental principles of calculating attention scores remain consistent across different data types.
Self-Attention Mechanism
The so-called self-attention mechanism refers to a process where the attention scores are computed by comparing the elements within the sequence itself. In other words, the reference for calculating attention scores is the sequence as a whole. For a tensor sequence input into a neural network, the self-attention mechanism requires computing the attention scores between each element and every other element in the sequence. The basic process can be represented in pseudocode as follows:
```
for (element_x : input_sequence) {
    for (element_y : input_sequence) {
        attention_score[x][y] = compute_score(element_x, element_y);
    }
}
```
However, for deep learning tasks, using two nested loops to compute values is highly time-consuming, especially when the input sequence length can be in the tens of thousands. Therefore, researchers use vectorized operations to calculate all pairwise attention scores efficiently. By using vectorized multiplication, we can obtain all attention scores with a single dot product operation between the input tensor sequence and its transposed version.
Through this matrix multiplication, combined with parallel computing techniques and the parallel processing capabilities of GPU devices, the efficiency of this computation can be greatly enhanced. To ensure that the attention scores are not fixed but can be gradually optimized during the training process, we use parameter matrices to adjust the calculation of attention scores. This enables the model to truly capture the information that deserves attention. Thus, the equation for computing attention scores can be expressed as follows: $$ Attention(Q,K,V) = softmax(\frac{QK^T}{\sqrt{d_k}})V $$ Here, $Q, K, V$ are obtained by multiplying each element in the input sequence by three shared weight matrices $W_q, W_k$ and $W_v$. The data multiplied by $W_q$ represents the element that will compute attention scores with the other elements in the sequence, while the data multiplied by $W_k$ represents the data being compared against. After comparing each element with every other element, the resulting sequence is that element's attention score sequence. This sequence is processed through the softmax function and then used to take a weighted sum of the value vectors (the inputs multiplied by $W_v$). The entire process essentially calculates the correlations between all elements and then performs a weighted sum, with the result being the output of the attention mechanism. Through this process, the information considered more important is multiplied by greater weights, so its "information" is magnified in the subsequent processes.
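As an illustration only, here is a toy C++ sketch of the scaled dot-product attention formula itself, with plain nested loops instead of optimized matrix kernels; Q, K and V are assumed to be already projected by $W_q$, $W_k$ and $W_v$, and the Matrix alias is my own:

```cpp
#include <cmath>
#include <vector>
using Matrix = std::vector<std::vector<double>>;

// Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
Matrix attention(const Matrix& Q, const Matrix& K, const Matrix& V) {
    int n = (int)Q.size(), dk = (int)Q[0].size(), dv = (int)V[0].size();
    Matrix scores(n, std::vector<double>(n, 0.0));

    // 1. Scaled dot products between every query and every key.
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            for (int t = 0; t < dk; ++t) scores[i][j] += Q[i][t] * K[j][t];
            scores[i][j] /= std::sqrt((double)dk);
        }

    // 2. Row-wise softmax turns raw scores into attention weights.
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int j = 0; j < n; ++j) { scores[i][j] = std::exp(scores[i][j]); sum += scores[i][j]; }
        for (int j = 0; j < n; ++j) scores[i][j] /= sum;
    }

    // 3. Weighted sum of the value vectors.
    Matrix out(n, std::vector<double>(dv, 0.0));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            for (int t = 0; t < dv; ++t) out[i][t] += scores[i][j] * V[j][t];
    return out;
}
```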
Transformer Model
The Transformer model adopts an encoder-decoder architecture, which is a popular architecture in the current deep learning field. Two independent models are used separately as an encoder and a decoder. In most cases, they process different types of data. For example, in image captioning, the encoder processes the image signal and the decoder generates the caption. In this paper, the Transformer is tested on an English-German translation task, so, imprecisely, you can imagine the encoder trying to "understand" a sentence in English and then generating some kind of abstract information; this information contains the concepts, subjects, and objects that are "mentioned" in the original language. This information is then passed to the decoder, whose task is to turn it into German.
Importantly, the comparison I mentioned above is extremely imprecise; this model cannot translate as humans do. In fact, it is still prediction or simulation, not understanding.
```cpp
private:
    int _min_(int i, int j) { return i < j ? i : j; }
    int _max_(int i, int j) { return i > j ? i : j; }
    // Minimum of three values: min(min(x, y), min(y, z)) == min(x, y, z).
    int _trimin_(int x, int y, int z) {
        return _min_(x, y) < _min_(y, z) ? _min_(x, y) : _min_(y, z);
    }
};
```
Explanation:
Use matrix[i][j] to represent the edit distance between word1[0…i] and word2[0…j]. The string content is indexed from 1; matrix[0][0] represents the case where word1 and word2 are both empty, which means their edit distance is 0. So matrix[0][j] is the case where word1 is empty and the length of word2 is j, and their edit distance is simply j. Similarly, matrix[i][0] is the case where the length of word1 is i and word2 is empty, and their edit distance is i.
Then we can deduce the transition for the three operations accordingly:
matrix[i - 1][j] + 1: insert the last character of word1 at the tail of word2.
matrix[i][j - 1] + 1: delete the last character of word2.
matrix[i - 1][j - 1] + 1: replace the last character of word2 with the last character of word1 (when word1[i] and word2[j] are already equal, no operation is needed and matrix[i][j] can simply take matrix[i - 1][j - 1]).
matrix[i][j] is then the minimum of these three options, as in the sketch below.
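Putting the explanation together, here is a minimal self-contained sketch of the full DP loop (written without the class helpers above; when the last characters already match, the diagonal value carries over without the +1):

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Edit distance between word1 and word2, following the recurrence above.
int minDistance(const string& word1, const string& word2) {
    int m = word1.size(), n = word2.size();
    vector<vector<int>> matrix(m + 1, vector<int>(n + 1, 0));
    for (int i = 0; i <= m; ++i) matrix[i][0] = i;   // word2 empty
    for (int j = 0; j <= n; ++j) matrix[0][j] = j;   // word1 empty
    for (int i = 1; i <= m; ++i)
        for (int j = 1; j <= n; ++j) {
            int replace_cost = (word1[i - 1] == word2[j - 1]) ? 0 : 1;
            matrix[i][j] = min({matrix[i - 1][j] + 1,                       // insert / delete
                                matrix[i][j - 1] + 1,                       // delete / insert
                                matrix[i - 1][j - 1] + replace_cost});      // replace (or keep)
        }
    return matrix[m][n];
}
```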
```cpp
    // Append '(' or ')' to str; `left` and `right` are the numbers of
    // parentheses still remaining to be placed.
    void p(string str, int left, int right) {
        if (left == 0 && right == 0) {
            res.push_back(str);
            return;
        }
        if (left == right) {
            p(str + "(", left - 1, right);
        } else {
            if (left > 0) {
                p(str + "(", left - 1, right);
            }
            p(str + ")", left, right - 1);
        }
    }
};
```
Explanation:
The number of remaining left parentheses must be less than or equal to the number of remaining right parentheses. If the number of remaining left parentheses equals the number of remaining right ones, the next character can only be a left parenthesis. If the number of remaining left parentheses is less than the number of remaining right ones, the next character can be either a left parenthesis or a right parenthesis. This problem can be solved with a recursive algorithm.
Use a dp array to hold the current minimum path sum. This sum can only come from either the dp value above plus the current grid point, or the dp value to the left plus the current grid point. The dp function can be defined as follows: $$ dp_{i,j} = grid_{i,j} + \min(dp_{i-1,j},\ dp_{i,j-1}) $$
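A minimal C++ sketch of that DP, under the usual assumption that the grid is non-empty (the function name minPathSum simply follows the common LeetCode signature):

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// dp[i][j]: minimum path sum to reach grid[i][j], moving only right or down.
int minPathSum(const vector<vector<int>>& grid) {
    int m = grid.size(), n = grid[0].size();
    vector<vector<int>> dp(m, vector<int>(n, 0));
    dp[0][0] = grid[0][0];
    for (int i = 1; i < m; ++i) dp[i][0] = dp[i - 1][0] + grid[i][0];   // first column: only from above
    for (int j = 1; j < n; ++j) dp[0][j] = dp[0][j - 1] + grid[0][j];   // first row: only from the left
    for (int i = 1; i < m; ++i)
        for (int j = 1; j < n; ++j)
            dp[i][j] = grid[i][j] + min(dp[i - 1][j], dp[i][j - 1]);    // up or left, whichever is smaller
    return dp[m - 1][n - 1];
}
```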
Welcome to my personal blog. I opened this blog when I was about to graduate from university, at a time when I really wanted something to document my life and the things I've learned. Now I'm on my way to the United States to start a brand new study experience. I want to keep some memories through this blog, and meanwhile take notes on things that might be useful for my future development.
About This Blog
Since my major is computer science/engineering, most of my study notes are related to fields like computer science, computer engineering, software development, etc. I have to admit that I don't excel at these subjects, but I am still keen on them and wish to make something big with my own hands. I hope this blog can witness my improvement :)
This blog does not only focus on academic matters. I will also upload some records of my daily life, like photos, music, and maybe some unexpected inspirations and ideas. I will write each blog article in English as much as I can, to improve my English writing skills, and maybe it will also benefit my future experience in the US 🙂
Hope you enjoy my blog, and discover something new!