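Adding search functionality to a Perl/CGI and MySQL website. The implementation steps are as follows:
1. Create the search form
First, add a search form to your HTML page: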
<form action="/cgi-bin/search.cgi" method="GET">
  <input type="text" name="query" placeholder="Enter search keywords...">
  <button type="submit">Search</button>
</form>
2. Create the search handler script
Next, write the search.cgi Perl script that handles search requests:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use DBI;

# Create the CGI object and send the HTTP header
my $cgi = CGI->new;
print $cgi->header(-charset => 'UTF-8');

# Get the search query (empty string if the parameter is missing)
my $query = $cgi->param('query') // '';

# Database connection parameters
my $dsn     = "DBI:mysql:database=your_database;host=localhost";
my $db_user = "username";
my $db_pass = "password";

# Connect to the database; with RaiseError set, DBI dies on any failure
my $dbh = DBI->connect($dsn, $db_user, $db_pass, { RaiseError => 1 });

# Collect results; an empty search skips the database query entirely
my @results;
if (length $query) {
    # Prepare the SQL query - assumes a pages table holding the page content
    my $sql = "SELECT page_id, title, content, url FROM pages WHERE title LIKE ? OR content LIKE ?";
    my $sth = $dbh->prepare($sql);

    # Run the query; placeholders keep user input out of the SQL text
    my $search_term = "%$query%";
    $sth->execute($search_term, $search_term);

    while (my $row = $sth->fetchrow_hashref) {
        # HTML-escape every value that will be printed into the page
        push @results, {
            title   => $cgi->escapeHTML($row->{title}),
            snippet => $cgi->escapeHTML(substr($row->{content} // '', 0, 150)) . "...",
            url     => $cgi->escapeHTML($row->{url}),
        };
    }
}

# Close the database connection
$dbh->disconnect();

# Escape the query before echoing it back, to prevent cross-site scripting
my $safe_query = $cgi->escapeHTML($query);
my $count      = scalar @results;

# Print the search results page
print <<HTML;
<!DOCTYPE html>
<html>
<head>
  <title>Search results: $safe_query</title>
  <meta charset="UTF-8">
  <style>
    .search-result { margin-bottom: 20px; }
    .result-title { font-size: 18px; }
    .result-snippet { color: #333; }
  </style>
</head>
<body>
  <h1>Search results: $safe_query</h1>
  <p>Found $count result(s)</p>
  <div class="search-results">
HTML

if (@results) {
    foreach my $result (@results) {
        print <<RESULT;
    <div class="search-result">
      <div class="result-title">
        <a href="$result->{url}">$result->{title}</a>
      </div>
      <div class="result-snippet">$result->{snippet}</div>
    </div>
RESULT
    }
} else {
    print "<p>No matching results found.</p>";
}

print <<HTML;
  </div>
</body>
</html>
HTML
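One detail of the LIKE search is worth illustrating: `%` and `_` are wildcards in a LIKE pattern, so a search for `50%` or `my_file` matches more than the literal text. The following is a minimal sketch of one way to handle this, not part of the original script. The helper name `like_pattern` is hypothetical, and it assumes MySQL's default LIKE escape character (backslash), which does not apply if the server runs with the NO_BACKSLASH_ESCAPES SQL mode.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): build a LIKE pattern
# that matches the user's input literally, anywhere in the column.
sub like_pattern {
    my ($input) = @_;
    $input = '' unless defined $input;

    # Put a backslash in front of the escape character itself and the two
    # LIKE wildcards, so MySQL treats them as ordinary characters.
    # Assumption: backslash is the active LIKE escape character.
    $input =~ s/([\\%_])/\\$1/g;

    # Wrap in % so the term may appear anywhere in the text.
    return "%$input%";
}

# Show the pattern produced for an input containing both wildcards.
print like_pattern('50%_off'), "\n";
```

In search.cgi, the value passed to `$sth->execute` could then be built with `like_pattern($query)` instead of `"%$query%"`.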
3. Index your website content
For the search to cover all of your content, the site's pages must be stored in the MySQL database:
CREATE TABLE pages (
    page_id      INT AUTO_INCREMENT PRIMARY KEY,
    title        VARCHAR(255) NOT NULL,
    content      TEXT,
    url          VARCHAR(255) NOT NULL,
    last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
4. Example crawler script
If you want to index the site's content automatically, you can use a Perl script along these lines:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use File::Find;
use HTML::TreeBuilder;

# Database connection parameters
my $dsn     = "DBI:mysql:database=your_database;host=localhost";
my $db_user = "username";
my $db_pass = "password";
my $dbh = DBI->connect($dsn, $db_user, $db_pass, { RaiseError => 1 });

# Website root directory and base URL.
# Use an absolute path: File::Find changes directory while it walks the tree.
my $website_root = "/path/to/your/website";
my $website_url  = "http://your-website.com";

# Clear the existing index
$dbh->do("TRUNCATE TABLE pages");

# Prepare the insert statement
my $insert_sth = $dbh->prepare(
    "INSERT INTO pages (title, content, url) VALUES (?, ?, ?)"
);

# Find and index all HTML files
find(\&process_file, $website_root);

# Close the database connection
$dbh->disconnect();

# Process each file found
sub process_file {
    my $file = $_;
    my $path = $File::Find::name;

    # Only process regular files ending in .htm or .html
    return unless -f $path && $file =~ /\.html?$/i;

    # Parse the HTML
    my $tree = HTML::TreeBuilder->new;
    $tree->parse_file($path);

    # Extract the title, falling back to the file name
    my $title = $tree->look_down('_tag', 'title');
    $title = $title ? $title->as_text : $file;

    # Extract the content (simplified; real pages may need more filtering)
    my $body = $tree->look_down('_tag', 'body');
    my $content = $body ? $body->as_text : '';

    # Compute the URL from the path relative to the website root
    my $rel_path = substr($path, length($website_root));
    $rel_path =~ s/^\///;    # remove the leading slash
    my $url = "$website_url/$rel_path";

    # Insert into the database
    $insert_sth->execute($title, $content, $url);

    # Free the parse tree
    $tree->delete;
}
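The URL computation in the crawler depends on how the root directory and base URL are written (for example, with or without a trailing slash). As an illustration only, the mapping could be isolated into a small function that can be checked without a database. The function name `path_to_url` and the sample paths are hypothetical, not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: map a file path under the website root to its URL.
# Returns nothing (undef in scalar context) if the path is outside the root.
sub path_to_url {
    my ($root, $base_url, $path) = @_;

    # Tolerate trailing slashes on both the root directory and the base URL.
    (my $clean_root = $root)     =~ s{/+$}{};
    (my $clean_base = $base_url) =~ s{/+$}{};

    # Only map paths that really live under the root directory.
    return unless index($path, "$clean_root/") == 0;

    # Keep the part of the path after "root/" and append it to the base URL.
    my $rel_path = substr($path, length($clean_root) + 1);
    return "$clean_base/$rel_path";
}

# Example with placeholder values in the style of the crawler script.
print path_to_url('/var/www/site/', 'http://your-website.com',
                  '/var/www/site/docs/a.htm'), "\n";
```

Keeping this step separate makes it easier to confirm that the URLs stored in the pages table match the links visitors actually use.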
5. Advanced feature extensions
6. Permission settings
Make sure your CGI script has the appropriate execute permissions:
chmod 755 /path/to/search.cgi
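This approach suits small and medium-sized websites: a LIKE pattern that starts with a wildcard cannot use an ordinary index, so every search scans the whole pages table. If your site is large or has special requirements, you may need a more advanced solution, such as a dedicated search engine (e.g. Elasticsearch) or an external service (e.g. Google CSE).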