Wu-Manber Algorithm: Introduction and Python Implementation

2024-01-23 22:09

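The Wu-Manber algorithm is a string-matching algorithm for searching text efficiently. It takes the shift idea of the Boyer-Moore algorithm, applies it to blocks of characters rather than single characters, and uses hash tables to look up how far the search window can move, which makes pattern matching fast while still reporting every occurrence.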

Steps of the Wu-Manber Algorithm

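1. Build a hash table that maps each short substring (block) of the pattern to the position where it occurs in the pattern.

2. Use this hash table to quickly identify potential starting positions of the pattern in the text.

3. Scan the text, comparing the characters in the current window with the corresponding characters of the pattern.

4. If the characters match, move on to the next character and continue comparing.

5. If they do not match, use the hash table to determine the maximum number of characters that can be skipped before the next potential starting position of the pattern.

6. This lets the algorithm skip large portions of the text without missing any potential match.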
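
The steps above can be sketched in a few lines. This is a minimal single-pattern illustration rather than a definitive implementation: it assumes a block size of 2 characters, uses a Python dict as the hash table, and the names build_shift_table and block_shift_search are hypothetical, chosen only for this example.

```python
def build_shift_table(pattern, block=2):
    """Map each block of `block` characters in the pattern to a safe shift."""
    m = len(pattern)
    shift = {}
    for end in range(block, m + 1):
        # A block ending at offset `end` lines up with the end of the
        # window after shifting by m - end. Later (rightmost) occurrences
        # overwrite earlier ones, so the smallest shift is kept.
        shift[pattern[end - block:end]] = m - end
    return shift


def block_shift_search(text, pattern, block=2):
    """Return the start indices of all occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m < block:
        # Pattern too short for block shifts: fall back to a plain scan.
        return [i for i in range(n - m + 1) if text[i:i + m] == pattern]
    shift = build_shift_table(pattern, block)
    default = m - block + 1  # shift for a block that never occurs in the pattern
    matches = []
    end = m  # exclusive end of the current window in the text
    while end <= n:
        step = shift.get(text[end - block:end], default)
        if step == 0:
            # The window ends with the pattern's last block: verify in full.
            if text[end - m:end] == pattern:
                matches.append(end - m)
            step = 1
        end += step
    return matches


print(block_shift_search("the cat sat on the mat", "the"))  # prints [0, 15]
```

A shift of 0 means the last block of the window equals the last block of the pattern, so only then is a full comparison needed. Every other window is skipped by at least one position, and by up to m - block + 1 positions when the block does not occur in the pattern at all.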

Python Implementation of the Wu-Manber Algorithm

# Define the hashPattern() function to generate
# a hash for the subpattern pattern[i:j]


def hashPattern(pattern, i, j):
    h = 0
    for k in range(i, j):
        # Treat the characters as digits of a base-256 number
        h = h * 256 + ord(pattern[k])
    return h

# Define the Wu-Manber-style search


def wuManber(text, pattern):

    # Define the length of the pattern and
    # text
    m = len(pattern)
    n = len(text)

    # Define the number of subpatterns to use
    s = 2

    # Define the length of each subpattern; any
    # leftover characters at the end of the pattern
    # are covered by the full comparison below
    t = m // s

    # Initialize the hash values for each
    # subpattern
    h = [0] * s
    for i in range(s):
        h[i] = hashPattern(pattern, i * t, (i + 1) * t)

    # Initialize the match value
    match = False

    # Iterate through the text one position at a
    # time; the subpattern hashes act as a cheap
    # filter before the full comparison
    for i in range(n - m + 1):
        # Check if the subpatterns match
        for j in range(s):
            if hashPattern(text, i + j * t, i + (j + 1) * t) != h[j]:
                break
        else:
            # If the subpatterns match, check if
            # the full pattern matches
            if text[i:i + m] == pattern:
                print("Match found at index", i)
                match = True

    # If no match was found, print a message
    if not match:
        print("No match found")


# Driver Code
text = "the cat sat on the mat"
pattern = "the"

# Function call
wuManber(text, pattern)
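
The function above searches for a single pattern. Wu-Manber is usually applied to a whole set of patterns, so the sketch below shows one way the block-shift idea might extend to several patterns at once. It rests on stated assumptions: a block size of 2, a non-empty pattern list in which every pattern is at least as long as the block, and the hypothetical names build_tables and multi_search.

```python
def build_tables(patterns, block=2):
    """Build the shift table and the table of candidate patterns per block."""
    # Only the first m characters of each pattern drive the shifts,
    # where m is the length of the shortest pattern.
    m = min(len(p) for p in patterns)
    default = m - block + 1
    shift, candidates = {}, {}
    for p in patterns:
        for end in range(block, m + 1):
            blk = p[end - block:end]
            # Keep the smallest shift seen for this block over all patterns.
            shift[blk] = min(shift.get(blk, default), m - end)
        # Patterns are grouped by the block that ends their first m characters.
        candidates.setdefault(p[m - block:m], []).append(p)
    return m, shift, candidates


def multi_search(text, patterns, block=2):
    """Return (index, pattern) pairs for every occurrence, in text order."""
    m, shift, candidates = build_tables(patterns, block)
    found = []
    end = m  # exclusive end of the current window in the text
    while end <= len(text):
        blk = text[end - block:end]
        step = shift.get(blk, m - block + 1)
        if step == 0:
            start = end - m
            # Verify only the patterns whose m-prefix ends with this block.
            for p in candidates.get(blk, []):
                if text.startswith(p, start):
                    found.append((start, p))
            step = 1
        end += step
    return found


print(multi_search("the cat sat on the mat", ["the", "mat", "cat"]))
# prints [(0, 'the'), (4, 'cat'), (15, 'the'), (19, 'mat')]
```

Grouping the patterns by their last block means that a window is compared only against the few patterns that could actually end there, instead of against the whole set.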