
You have a long flowerbed in which some of the plots are planted, and some are not. However, flowers cannot be planted in adjacent plots.

Given an integer array flowerbed containing 0s and 1s, where 0 means empty and 1 means not empty, and an integer n, return true if n new flowers can be planted in the flowerbed without violating the no-adjacent flowers rule and false otherwise.

Example:

Input: flowerbed = [1, 0, 0, 0, 1], n = 1;
Output: true;
Input: flowerbed = [1, 0, 0, 0, 1], n = 2;
Output: false;

Constraints:

  • $ 1 \le len_{flowerbed} \le 2 \times 10^4 $
  • $ flowerbed_i \in \{0, 1\} $
  • $ 0 \le n \le len_{flowerbed} $
  • There are no two adjacent flowers in flowerbed.

Solution

class Solution {
public:
    bool canPlaceFlowers(vector<int>& flowerbed, int n) {
        for (int i = 0; i < flowerbed.size(); i++) {
            // A plot is plantable if it is empty and both neighbours (or the borders) are empty.
            bool left = i == 0 || flowerbed[i - 1] == 0;
            bool right = i == flowerbed.size() - 1 || flowerbed[i + 1] == 0;

            if (left && right && flowerbed[i] == 0) {
                flowerbed[i] = 1;  // plant here so the next plot sees this one as occupied
                n--;
            }
        }

        return n <= 0;
    }
};

Explanation:

The basic idea of this solution is very simple: traverse the whole array and decide whether the current plot is legal for planting a new flower by looking left and right at each plot. If a plot is 0 and its left and right neighbours are both 0 as well, it is a legal plot to plant in, so we plant a new flower there by marking it 1 and, accordingly, decrease the number of flowers left to plant. There are also two exceptional conditions, the border plots: we handle them separately by only looking right at the left margin and only looking left at the right margin; otherwise the logic is the same as for the other plots.

At last, we simply check whether all of the flowers have been planted: if yes, return true, otherwise return false.
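
As a quick sanity check, here is a minimal driver, a sketch assuming the Solution class above sits in the same file together with the usual headers (<vector>, using namespace std); it reproduces the two examples. Note that canPlaceFlowers plants flowers in place, so each call gets its own copy of the bed.

#include <iostream>
#include <vector>
using namespace std;

int main() {
    Solution s;
    vector<int> bed1 = {1, 0, 0, 0, 1};
    cout << boolalpha << s.canPlaceFlowers(bed1, 1) << endl;  // true

    vector<int> bed2 = {1, 0, 0, 0, 1};  // fresh copy, since the first call modified bed1
    cout << boolalpha << s.canPlaceFlowers(bed2, 2) << endl;  // false
    return 0;
}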

For two strings s and t, we say t divides s if and only if $ s = t + t + \dots + t $ (i.e., t is concatenated with itself one or more times).

Given two strings str1 and str2, return the largest string x such that x divides both str1 and str2.

Example:

Input: str1 = "ABABC", str2 = "ABC"
Ouput: "ABC"
Input: str1 = "ABABAB", str2 = "ABAB"
Output: "AB"
Input: str1 = "LEET", str2 = "CODE"
Output: ""

Constraints:

  • $ 1 \le len_{str1}, len_{str2} \le 1000 $
  • str1 and str2 consist of English uppercase letters.

Solution

class Solution {
public:
    string gcdOfStrings(string str1, string str2) {
        return (str1 + str2 == str2 + str1) ?
            str1.substr(0, gcd(size(str1), size(str2))) : "";
    }
};

Explanation:
For two strings, if they are built from the same repeated substring (the same ingredient), then concatenating them in either order gives the same result. For example, if string A is ababab and string B is abab, they obviously share the ingredient ab, so if we combine them as $ A + B $ and $ B + A $, the two results must be equal. So, once we have made sure that both strings are made of the same ingredient, we can use gcd on their lengths to find the greatest common divisor string.

Proof:
Assume $str_1 = nX$ and $str_2 = mX$, where $X$ is a common divisor string of the two strings. Then the concatenation $str_1 + str_2$ equals $(n + m)X$, and so does $str_2 + str_1$, so the equality $str_1 + str_2 = str_2 + str_1$ is a necessary condition. Conversely, when the equality holds, both strings are repetitions of a common base string, and the prefix of length $\mathrm{gcd}(len_1, len_2)$ divides both $str_1$ and $str_2$, making it the largest common divisor string. The core observation is that $str_1 + str_2 = str_2 + str_1$.
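
As a quick check with the second example, take $str_1 =$ "ABABAB" and $str_2 =$ "ABAB":
$$
str_1 + str_2 = \text{ABABABABAB} = str_2 + str_1, \qquad \gcd(6, 4) = 2 \;\Rightarrow\; x = \text{"AB"}.
$$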

General Solution:

class Solution {
    // Check whether s is exactly t repeated an integral number of times.
    bool check(string t, string s) {
        int lenx = (int)s.length() / (int)t.length();
        string ans = "";
        for (int i = 0; i < lenx; i++) {
            ans = ans + t;
        }
        return ans == s;
    }

public:
    string gcdOfStrings(string str1, string str2) {
        int len1 = (int)str1.length(), len2 = (int)str2.length();
        // Try candidate lengths from the longest possible divisor down to 1.
        for (int i = min(len1, len2); i >= 1; i--) {
            if (len1 % i == 0 && len2 % i == 0) {
                string X = str1.substr(0, i);
                if (check(X, str1) && check(X, str2)) return X;
            }
        }
        return "";
    }
};

Based on the 黑马程序员 (itheima) video tutorials.
MyBatisPlus Official Site: MyBatisPlus

MyBatisPlus Simplifies MyBatis CRUD Operations

With plain MyBatis, implementing CRUD usually means writing the SQL statements in the corresponding mapper.xml. Writing large amounts of single-table SQL is tedious (especially update and query operations, which need many IF conditions). MyBatisPlus simplifies this workflow.

Using MyBatisPlus

Import the dependency: a classic Spring Boot starter dependency

<dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-boot-starter</artifactId>
    <version>3.5.3.1</version>
</dependency>

Define a Mapper:
Make your custom Mapper extend the BaseMapper interface provided by MyBatisPlus:

public interface UserMapper extends BaseMapper<User> {
}

This BaseMapper provides a large number of prebuilt CRUD methods. Note that the generic type of BaseMapper must be the entity class you need (the User class in the code above).

When performing CRUD, just call UserMapper to use the CRUD methods predefined in BaseMapper, with no need to write custom methods in the mapper. Of course, if you still need more complex custom CRUD methods, simply write the SQL yourself in UserMapper and mapper.xml.

Common MyBatisPlus Annotations

MyBatisPlus scans entity classes and uses reflection to extract entity information as the database table information.

It locates tables, columns, and so on through a few conventions:

  • The class name, converted from camel case to snake case, is the table name
  • The field named id is the primary key
  • Variable names, converted from camel case to snake case, are the table column names

If an entity class does not follow these conventions, annotations are needed to specify the mappings:

  • @TableName: specifies the table name
  • @TableId: specifies the primary key column of the table
  • @TableField: specifies an ordinary column of the table
@TableName("tb_user")  // 'tb_user' is the table name in database
public class User {
@TableId("id", type=IdType.AUTO) // 'id' here is the primary key name in database
private Long user_id; // here the name of primary key is different with the name in database

@TableField("name")
private String userName;

@TableField("is_married")
private Boolean isMarried;

@TableField("`order`") // using escape character `` to avoid conflict with SQL keywords, like 'order' is a keyword of 'order by' sentence
private Integer order;

@TableField(exist=false)
private String address; // 'address' is not a field in database;

}

The commonly used IdType values include:

  • AUTO: database auto-increment (AUTO_INCREMENT)
  • INPUT: the value is supplied manually via the setter
  • ASSIGN_ID: an assigned ID generated by the nextId method of the IdentifierGenerator interface; the default implementation is DefaultIdentifierGenerator (snowflake algorithm)

Common scenarios for @TableField:

  • The member variable name differs from the database column name
  • The member variable name starts with is and is a Boolean
  • The member variable name conflicts with a database keyword
  • The member variable is not a database column

Given an integer $n$, return the number of structurally unique BSTs (binary search trees) which have exactly $n$ nodes with unique values from $1$ to $n$.

Example:

Input: n = 3;
Output: 5;

Constraints:

$1 \le n \le 19$

Solution

class UniqueBinarySearchTrees {
public:
    int numTrees(int n) {
        // dp[i] = number of structurally unique BSTs with i nodes
        vector<int> dp(n + 1, 0);
        dp[0] = dp[1] = 1;
        for (int i = 2; i <= n; i++) {
            for (int j = 1; j <= i; j++) {
                // j is the root: j - 1 nodes go into the left subtree, i - j into the right
                dp[i] += dp[j - 1] * dp[i - j];
            }
        }
        return dp[n];
    }
};

Explanation:

Assume that the number of BSTs with $n$ nodes is $G(n)$, and $f(i)$ denotes the number of BSTs whose root value is $i$; thus:
$$
G(n) = \sum_{i=1}^{n} f(i).
$$

And when $i$ is the root node, the number of nodes in its left subtree is $i-1$ and the number of nodes in its right subtree is $n-i$, thus
$$
f(i) = G(i-1) \cdot G(n-i).
$$

Fusing the above two equations, the Catalan number recurrence is deduced:
$$
G(n) = \sum_{i=1}^{n} G(i-1) \cdot G(n-i), \qquad G(0) = G(1) = 1.
$$
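
For instance, with $n = 3$ the recurrence reproduces the example answer:
$$
G(3) = G(0)G(2) + G(1)G(1) + G(2)G(0) = 1 \cdot 2 + 1 \cdot 1 + 2 \cdot 1 = 5.
$$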

北京北站 (Beijing North Railway Station) – Departure

Setting out from 西直門 (Xizhimen) in the capital, following the 京張鐵路 (Beijing–Zhangjiakou Railway) that 詹天佑 (Zhan Tianyou) once built, and passing 張家口 (Zhangjiakou), the train leaves the vast North China Plain behind for good and enters a land where mountains and plateau interlock.

大同府 (Datong Prefecture)

大同府 sits in the Datong Basin of the Three Jin (Shanxi) lands, north of 雁門關 (Yanmen Pass). Since antiquity it belonged to 雲州 (Yunzhou) and was counted as lying beyond the mountains. Datong has long been the key meeting point between the Central Plains and the nomadic peoples of the north: the Northern Wei made it their capital 平城 (Pingcheng) and began the historical process of Xianbei sinicization there. In the Song era it also fell within the Sixteen Prefectures of Yan and Yun that the Northern Song never recovered, and the neighboring regions of 代, 朔, 應, and 忻 were always the front line of the border.

Datong lies at the northern end of today's Shanxi Province. Going north one soon reaches the Inner Mongolian grasslands; to the south the 太岳 (Taiyue), 呂梁 (Lüliang), and 太行 (Taihang) ranges stretch across the way, blocking the route to the Central Plains and leaving only a few narrow mountain paths for soldiers and civilians.

三晉邊關 (The Border Passes of the Three Jin)

Cradled by mountains, Datong enjoys such favorable geography that every nomadic regime moving south chose it early as an important city, whether as the Xianbei's 平城 or the Khitan's Western Capital (西京). It was the first step of the nomads' southward advance and the northern outpost of the eighteen provinces of China proper, so the history of 大同府 is also one steeped in military iron and the blood and tears of war.

There is a large block of cheese with a height of $h$. We can assume its length and width are infinitely large. Inside this cheese, there are many spherical holes of the same radius. We can establish a spatial coordinate system within this cheese, where the bottom surface of the cheese is at $z=0$ and the top surface of the cheese is at $z=h$.

Currently, there is a little mouse named Jerry on the bottom surface of the cheese. Jerry knows the coordinates of the centers of all the holes in the cheese. If two holes are tangent or intersect, Jerry can move from one hole to another. Specially, if a hole is tangent to or intersects the bottom surface, Jerry can enter the hole from the bottom surface of the cheese; if a hole is tangent to or intersects the top surface, Jerry can exit from the hole to the top surface of the cheese.

Jerry, located on the bottom surface of the cheese, wants to know if he can utilize the existing holes to reach the top surface of the cheese without damaging the cheese.

Each input contains multiple sets of data.

The first line contains an integer $T$, the number of test cases. Each test case begins with a line containing three integers $n, h, r$, separated by spaces, giving the number of holes, the height of the cheese, and the radius of the holes. The following $n$ lines each contain three integers $x, y, z$, separated by spaces, giving the coordinates of a hole's center.

Solution

#include <vector>
using namespace std;

// Long64 is a convenience alias for 64-bit coordinates.
using Long64 = long long;

class Cheese {
private:
    vector<int> parent;
    vector<int> rank;
    vector<Long64> X;
    vector<Long64> Y;
    vector<Long64> Z;
    Long64 radius;
    Long64 height;

    // Two holes connect if the distance between their centers is at most 2r;
    // pure integer arithmetic avoids the rounding issues of pow().
    bool isConnect(Long64 x1, Long64 y1, Long64 z1, Long64 x2, Long64 y2, Long64 z2, Long64 r) {
        Long64 dx = x1 - x2, dy = y1 - y2, dz = z1 - z2;
        return dx * dx + dy * dy + dz * dz <= 4 * r * r;
    }

    int find(int x) {
        return x == parent[x] ? x : (parent[x] = find(parent[x]));
    }

    void merge(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);
        if (rootX != rootY) {
            if (rank[rootX] < rank[rootY]) {
                parent[rootX] = rootY;
            } else if (rank[rootX] > rank[rootY]) {
                parent[rootY] = rootX;
            } else {
                parent[rootX] = rootY;
                rank[rootY]++;
            }
        }
    }

public:
    Cheese(vector<Long64> X, vector<Long64> Y, vector<Long64> Z, Long64 radius, Long64 height) {
        this->X = X;
        this->Y = Y;
        this->Z = Z;
        this->radius = radius;
        this->height = height;

        int n = X.size();

        // holes are 0..n-1; use n for the bottom surface and n + 1 for the top surface
        parent.resize(n + 2);
        rank.resize(n + 2, 1);

        for (int i = 0; i < n + 2; i++) {
            parent[i] = i;
        }
    }

    bool canReach() {
        int n = X.size();
        for (int i = 0; i < n; i++) {
            if (Z[i] <= radius) {          // hole touches or crosses the bottom surface (z = 0)
                merge(i, n);
            }
            if (Z[i] + radius >= height) { // hole touches or crosses the top surface (z = h)
                merge(i, n + 1);
            }
        }

        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                if (isConnect(X[i], Y[i], Z[i], X[j], Y[j], Z[j], radius)) {
                    merge(i, j);
                }
            }
        }

        return find(n) == find(n + 1);
    }
};

Explanation:

This is a typical merge-find (union-find) problem. Besides the holes, we introduce two extra set representatives, one for the bottom surface and one for the top surface. If a hole is tangent to or intersects the bottom surface, we merge it into the bottom set; similarly, we merge holes that touch the top surface into the top set, and we merge any two holes whose spheres are tangent or intersect. To determine whether the holes form a pathway, we just need to know whether the top surface and the bottom surface end up in the same set. All of the above can be achieved with the merge-find algorithm.
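
A possible driver for the input format described above, given as a sketch: it assumes the Cheese class and the Long64 alias above are in the same file, and the "Yes"/"No" output wording is an assumption, since the statement does not specify it.

#include <iostream>

int main() {
    int T;
    cin >> T;                        // number of test cases
    while (T--) {
        int n;
        Long64 h, r;
        cin >> n >> h >> r;          // number of holes, cheese height, hole radius
        vector<Long64> X(n), Y(n), Z(n);
        for (int i = 0; i < n; i++) cin >> X[i] >> Y[i] >> Z[i];
        Cheese cheese(X, Y, Z, r, h);
        cout << (cheese.canReach() ? "Yes" : "No") << endl;  // output wording assumed
    }
    return 0;
}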

Merge-find(并查集), also known as Disjoint-set, is a data structure that efficiently handles the dynamic connectivity problem. It supports two primary operations: Union (to merge two sets) and find (to determine the representative or root of a set). This structure is especially useful in algorithms that deal with partitioning elements into disjoint subsets, such as Kruskal’s algorithm for finding the Minimum Spanning Tree.

Key Concepts

Data Structure

  • Parent Array: Each element has a pointer to its parent. For a representative element (root), the parent points to itself.
  • Rank Array (or Size Array): Keeps track of the rank (or size) of the tree rooted at a particular element. It helps in optimizing the union operation by always attaching the smaller tree under the larger one.

Operations

  1. Find:

    • Determines which subset a particular element is in.
    • Can be used to check if two elements are in the same subset.
    • Uses path compression to flatten the structure, making future operations faster.
  2. Union:

    • Merges two subsets into a single subset.
    • Uses union by rank or union by size to keep the tree shallow, improving efficiency.

Pseudocode and Explanation

Initialization

  • Parent Array: Each element is its own parent initially.
  • Rank Array: Initialize rank or size of each element as $0$ or $1$.

Find with Path Compression

Pseudocode:

int find(int x) {
    if (parent[x] != x) {
        parent[x] = find(parent[x]);
    }
    return parent[x];
}

Explanation:

  • Recursively traverses the parent pointers until the root is found.
  • Path compression flattens the tree by making every node point directly to the root, reducing the time complexity for future operations.
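
For reference, an equivalent iterative form of find with path compression, sketched under the same assumptions (a parent array in scope), is useful when recursion depth is a concern:

int find(int x) {
    // first walk up to the root
    int root = x;
    while (parent[root] != root) {
        root = parent[root];
    }
    // then point every node on the path directly at the root
    while (parent[x] != root) {
        int next = parent[x];
        parent[x] = root;
        x = next;
    }
    return root;
}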

Union by Rank

Pseudocode:

void union(int x, int y) {  // note: 'union' is a reserved word in C++, so the full implementation below names it 'merge'
    int rootX = find(x);
    int rootY = find(y);

    if (rootX != rootY) {
        // union by rank
        if (rank[rootX] > rank[rootY]) {
            parent[rootY] = rootX;
        } else if (rank[rootX] < rank[rootY]) {
            parent[rootX] = rootY;
        } else {
            parent[rootY] = rootX;
            rank[rootX]++;
        }
    }
}

Explanation:

  • Find the roots of the elements $x$ and $y$.
  • Attach the smaller tree under the larger tree to keep the overall structure balanced.
  • If both trees are of the same rank, make one the parent of the other and increase its rank.

Complete C++ Implementation

#include <vector>
using namespace std;

class UnionFind {
private:
    vector<int> parent;
    vector<int> rank;

public:
    UnionFind(int n) {
        parent.resize(n);
        rank.resize(n, 0);
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);
        }
        return parent[x];
    }

    void merge(int x, int y) {
        int rootX = find(x);
        int rootY = find(y);

        if (rootX != rootY) {
            if (rank[rootX] > rank[rootY]) {
                parent[rootY] = rootX;
            } else if (rank[rootX] < rank[rootY]) {
                parent[rootX] = rootY;
            } else {
                parent[rootY] = rootX;
                rank[rootX]++;
            }
        }
    }

    bool connected(int x, int y) {
        return find(x) == find(y);
    }
};
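
As a quick usage sketch (the element count and the merged pairs below are just illustrative), assuming the UnionFind class above is in the same file:

#include <iostream>

int main() {
    UnionFind uf(5);           // elements 0..4, each initially in its own set
    uf.merge(0, 1);
    uf.merge(3, 4);
    cout << boolalpha
         << uf.connected(0, 1) << endl    // true
         << uf.connected(1, 3) << endl;   // false
    return 0;
}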

The Knapsack Problem (背包问题) is a classic optimization problem in computer science and mathematics. It involves a set of items, each with a weight and a value, and a knapsack with a maximum weight capacity. The goal is to determine the most valuable combination of items to include in the knapsack without exceeding its weight capacity.

Variants of the Knapsack Problem

0/1 Knapsack Problem

In this version, each item can either be included in the knapsack or excluded; you can't take a fraction of an item. Formally, given $n$ items with weights $w_1,w_2,…,w_n$ and values $v_1, v_2,…,v_n$, and a knapsack with capacity $W$, the problem is to maximize the total value $V$ while ensuring that the total weight $W'$ does not exceed $W$.

Mathematical formulation:
Maximize:
$$
V=\sum_{i=1}^n v_i x_i
$$
Subject to:
$$
\sum_{i=1}^n w_i x_i \le W
$$
Where $x_i$ is $0$ or $1$, indicating whether item $i$ is included.

Fractional Knapsack Problem

In this version, you can take fractions of items. This allows a greedy algorithm to achieve the optimal solution by taking items in decreasing order of their value-to-weight ratio until the knapsack is full.
Mathematical formulation:
Maximize:
$$
V=\sum_{i=1}^{n}v_i x_i
$$
Subject to:
$$
\sum_{i=1}^n w_i x_i \le W
$$
Where $0 \le x_i \le 1 $, indicating the fraction of item $i$ included.

Solution Approaches

Dynamic Programming

Used primarily for the 0/1 knapsack problem. It builds up a table of solutions to subproblems to find the optimal solution.

In the context of the 0/1 knapsack problem, the transition equation helps build the dynamic programming solution. This equation is used to determine the maximum value achievable by either including or excluding an item based on previous solutions. Let's delve into the specifics.

Dynamic Programming Transition Equation for 0/1 Knapsack Problem

For the 0/1 knapsack problem, the transition equation determines the maximum value $dp_{i,w}$ achievable using the first $i$ items with a maximum weight capacity $w$. The equation can be expressed as:
$$
dp_{i,w}=\max(dp_{i-1,w},\ dp_{i-1,\, w-w_i} + v_i)
$$
Where:

  • $dp_{i,w}$ is the maximum value achievable with the first $i$ items and capacity $w$.
  • $v_i$ and $w_i$ are the value and weight of the $i$-th item.
  • $dp_{i-1,w}$ represents the maximum value without including the $i$-th item.
  • $dp_{i-1, w-w_i} + v_i$ represents the maximum value including the $i$-th item, given that $w-w_i$ must be non-negative.

Explanation:

  1. Initialization:
    • $dp_{0,w}=0$ for all $w$, meaning if no items are considered, the maximum value is zero regardless of the weight capacity.
  2. Recurrence:
    • For each item $i$ and each weight $w$
      • If the item’s weight $w_i$ is greater than $w$, the item cannot be included, so the maximum value remains $dp_{i-1,w}$.
      • If the item’s weight $w_i$ is less than or equal to $w$, consider two cases:
        • Exclude the item: The maximum value is $dp_{i-1,w}$.
        • Include the item: The maximum value is $dp_{i-1, w-w_i} + v_i$.
      • Take the maximum of these two cases.

Implementation

int knapsack_DP(const vector<int>& values, const vector<int>& weights, int W) {
    int n = values.size();

    // create a 2-D DP table to store the maximum value for each subproblem
    vector<vector<int>> dp(n + 1, vector<int>(W + 1, 0));

    // iterate through each item
    for (int i = 1; i <= n; i++) {
        for (int w = 0; w <= W; w++) {
            if (weights[i - 1] <= w) {
                dp[i][w] = max(dp[i - 1][w], dp[i - 1][w - weights[i - 1]] + values[i - 1]);
            } else {
                dp[i][w] = dp[i - 1][w];
            }
        }
    }

    return dp[n][W];
}
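
Since row $i$ of the table only depends on row $i-1$, the same recurrence can be rolled into a single one-dimensional array. Here is a minimal sketch of this common space optimization (the function name knapsack_DP_1D is just illustrative); the weight loop must run downward so that each item is counted at most once.

#include <vector>
#include <algorithm>
using namespace std;

int knapsack_DP_1D(const vector<int>& values, const vector<int>& weights, int W) {
    int n = values.size();
    vector<int> dp(W + 1, 0);  // dp[w] = best value with capacity w using the items seen so far
    for (int i = 0; i < n; i++) {
        // iterate capacities downward so dp[w - weights[i]] still refers to the previous item's row
        for (int w = W; w >= weights[i]; w--) {
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i]);
        }
    }
    return dp[W];
}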

Greedy Algorithm

Effective for the fractional knapsack problem. It involves selecting items based on their value-to-weight ratio until the knapsack is full.

Implementation

struct Item {
    int value;
    int weight;
    double value_per_weight;
};

bool compare(const Item& a, const Item& b) {
    return a.value_per_weight > b.value_per_weight;
}

double fractionalKnapsack(const vector<int>& values, const vector<int>& weights, int W) {
    int n = values.size();
    vector<Item> items(n);

    for (int i = 0; i < n; i++) {
        items[i] = {values[i], weights[i], (double)values[i] / weights[i]};
    }

    // sort items by value-to-weight ratio, highest first
    sort(items.begin(), items.end(), compare);

    double total_value = 0.0;
    int current_weight = 0;

    for (const auto& item : items) {
        if (current_weight + item.weight <= W) {
            current_weight += item.weight;
            total_value += item.value;
        } else {
            // take only the fraction of this item that still fits
            int remain = W - current_weight;
            total_value += item.value_per_weight * remain;
            break;
        }
    }

    return total_value;
}
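
A small illustrative run, assuming fractionalKnapsack above is compiled in the same file with the usual headers (<vector>, <algorithm>, using namespace std); the item values are made up purely to exercise the function.

#include <iostream>

int main() {
    vector<int> values  = {60, 100, 120};
    vector<int> weights = {10, 20, 30};
    // value-to-weight ratios are 6, 5, 4: take the first two items whole, then 20/30 of the third
    cout << fractionalKnapsack(values, weights, 50) << endl;  // 60 + 100 + 80 = 240
    return 0;
}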

What are P-Problems and NP-Problems

Before introducing the concepts of P-problems and NP-problems, we need to know something about time complexity.

Some time complexities are affordable for a computer: the running time stays manageable, or worthwhile, as the input grows. Other kinds are not nearly so time-saving. To categorize algorithms, computer scientists found that certain patterns of time complexity are time-efficient: the polynomial ones, which is what the letter P in P-problem stands for. For the rest, we cannot determine whether the problem can be solved in polynomial time, but a candidate solution can still be checked in polynomial time; these are the nondeterministic polynomial problems, the so-called NP. Problems in P are considered efficiently solvable.

  • Time complexities like $O(1), O(n), O(\log n), O(n^c)$ are called polynomial.
  • Time complexities like $O(c^n), O(n!)$ are called non-polynomial.

So, after understanding what the term polynomial means for algorithm efficiency, we can go on to learn the definitions of P-problems and NP-problems.

These concepts are really about the solvability of a problem.

P Problem

If a problem can be solved in polynomial time, it can be deemed a P problem.

NP Problem

Given a problem, if any proposed solution to it can be verified within polynomial time, the problem is defined as a nondeterministic polynomial (NP) problem.

And it is important to note that many problems in NP have no known polynomial-time solution.

Relationship between P-Problem and NP-Problem

First, it is obvious that every P problem is an NP problem. However, is the converse also true? This is hard to say, and the question has become the famous $P=NP?$ problem.

NPC Problem

The NP-complete problems are a subset of NP problems that are as hard as any problem in NP. If any NP-complete problem can be solved in polynomial time, then every problem in NP can be solved in polynomial time.

In short, the NP-Complete problems are the hardest NP problems!

Suppose there is an NP problem to which every NP problem can be reduced. Then, if we only find a polynomial-time solution for this one problem, every NP problem gains a polynomial-time solution by reducing to it.

And how to prove a problem is NP-Complete?

Steps to Prove a Problem is NP-Complete

  1. Show the Problem is in NP
    Verification in polynomial time: demonstrate that given a candidate solution, you can verify its correctness in polynomial time.
  2. Select a Known NP-Complete Problem
    Source problem: choose a problem already known to be NP-Complete for the reduction.
  3. Construct a Polynomial Time Reduction
    Transform: develop a polynomial time algorithm that converts instances of the chosen NP-Complete problem to instances of the problem you are proving NP-Complete
    1. Input transformation: map any instance of the known NP-Complete problem to an instance of your problem.
    2. Solution transformation: ensure that a solution to the transformed instance of your problem corresponds to a solution to the original NP-Complete problem.

In summary, only two steps need to be done:

  • first show you can verify solutions in polynomial time.
  • then transform a known NP-Complete problem to your problem in polynomial time.

Common NP-Complete Problems for Reductions:

  • Satisfiability (SAT)
  • 3-SAT
  • Clique
  • Vertex Cover
  • Hamiltonian Path (Cycle)
  • Traveling Salesman Problem (TSP)

Given an array $words$ of size $n$ sorted in lexicographical order, where all strings in $words$ are written in lowercase letters, and given a specific string $prefix$ of length $m$, count the number of strings in $words$ that have $prefix$ as a prefix.

Example:

Input: ["aaa", "ba", "bac", "bat", "batbat", "batq", "bta", "c"];
Output: 3;

Constraints:

  • The time complexity should be $O(m\log n)$

Solution

class CountPrefix {
private:
    string prefix;
    vector<string> words;

    bool hasPrefix(string& word, string& pref) {
        if (word.size() < pref.size()) return false;
        for (int i = 0; i < pref.size(); i++) {
            if (word[i] != pref[i])
                return false;
        }
        return true;
    }

    // binary search for the first word that carries the prefix
    int findFirst(int left, int right) {
        int low = left, high = right, label = -1;
        while (low <= high) {
            int mid = (low + high) / 2;
            if (hasPrefix(words[mid], prefix)) {
                label = mid, high = mid - 1;
            } else if (prefix < words[mid]) {
                high = mid - 1;
            } else {
                low = mid + 1;
            }
        }
        return label;
    }

    // binary search for the last word that carries the prefix
    int findLast(int left, int right) {
        int low = left, high = right, label = -1;
        while (low <= high) {
            int mid = (low + high) / 2;
            if (hasPrefix(words[mid], prefix)) {
                label = mid, low = mid + 1;
            } else if (prefix < words[mid]) {
                high = mid - 1;
            } else {
                low = mid + 1;
            }
        }
        return label;
    }

public:
    CountPrefix(vector<string> words, string prefix) : words(words), prefix(prefix) {}

    int count() {
        int from = findFirst(0, words.size() - 1);
        int to = findLast(0, words.size() - 1);

        if (from < 0) return 0;
        else return to - from + 1;
    }
};

Explanation:

The main idea of this algorithm is to use binary search to find the boundary of the words that contain the specific prefix. The function $findFirst$ returns the position of the first word carrying the prefix and $findLast$ returns that of the last one. The only difference between $findFirst$ and $findLast$ is that, when a match is hit, $findFirst$ keeps searching the section before the hit while $findLast$ keeps searching the section after it. Repeating this narrows the search down to the first and last matching positions.

About the time complexity: the time complexity of $hasPrefix$ is $O(m)$. Suppose the time complexity of $findFirst$ and $findLast$ is $T(n)$; then
$$
T(n)=T(\frac{n}{2}) + O(m), T(1) \in O(m),
$$
which means $T(n) \in O(m \log n)$. Thus the time complexity of $countPrefix$ is $O(m\log n)$.
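
Unrolling the recurrence makes the bound explicit:
$$
T(n) = T\!\left(\tfrac{n}{2}\right) + O(m) = T\!\left(\tfrac{n}{4}\right) + 2\,O(m) = \cdots = O(m \log n),
$$
since the search range can be halved at most $\log_2 n$ times and each level costs one $O(m)$ prefix comparison.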
