Here is what I have researched so far.
@mcury have you ever played with Squid's StoreID?
My overall goal is for photos that have already been delivered to show up as cache hits, since they are the same photos over and over whenever I go to Facebook. This will save energy and stay secure, since it is encrypted inside the pfSense cache anyway. The ID retrieval is where I am having issues: I can store objects with StoreID now, but I cannot get them redelivered from the cache.
Squid even comes with a small StoreID helper that is not activated by default. It is stored at this path:
/usr/local/libexec/squid/storeid_file_rewrite
The code follows:
#!/usr/local/bin/perl
use strict;
use warnings;
use Pod::Usage;
=pod
=head1 NAME
storeid_file_rewrite - File based Store-ID helper for Squid
=head1 SYNOPSIS
storeid_file_rewrite filepath
=head1 DESCRIPTION
This program acts as a store_id helper program, rewriting URLs passed
by Squid into storage-ids that can be used to achieve better caching
for websites that use different URLs for the same content.
It takes a text file with two tab separated columns.
Column 1: Regular expression to match against the URL
Column 2: Rewrite rule to generate a Store-ID
Eg:
^http:\/\/[^\.]+\.dl\.sourceforge\.net\/(.*) http://dl.sourceforge.net.squid.internal/$1
Rewrite rules are matched in the same order as they appear in the rules file.
So for best performance, sort it in order of frequency of occurrence.
This program will automatically detect the existence of a concurrency channel-ID and adjust appropriately.
It may be used with any value 0 or above for the store_id_children concurrency= parameter.
=head1 OPTIONS
The only command line parameter this helper takes is the regex rules file name.
=head1 AUTHOR
This program and documentation was written by I<Alan Mizrahi <alan@mizrahi.com.ve>>
Based on prior work by I<Eliezer Croitoru <eliezer@ngtech.co.il>>
=head1 COPYRIGHT
* Copyright (C) 1996-2023 The Squid Software Foundation and contributors
*
* Squid software is distributed under GPLv2+ license and includes
* contributions from numerous individuals and organizations.
* Please see the COPYING and CONTRIBUTORS files for details.
Copyright (C) 2013 Alan Mizrahi <alan@mizrahi.com.ve>
Based on code from Eliezer Croitoru <eliezer@ngtech.co.il>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
=head1 QUESTIONS
Questions on the usage of this program can be sent to the I<Squid Users mailing list <squid-users@lists.squid-cache.org>>
=head1 REPORTING BUGS
Bug reports need to be made in English.
See http://wiki.squid-cache.org/SquidFaq/BugReporting for details of what you need to include with your bug report.
Report bugs or bug fixes using http://bugs.squid-cache.org/
Report serious security bugs to I<Squid Bugs <squid-bugs@lists.squid-cache.org>>
Report ideas for new improvements to the I<Squid Developers mailing list <squid-dev@lists.squid-cache.org>>
=head1 SEE ALSO
squid (8), GPL (7),
The Squid wiki http://wiki.squid-cache.org/Features/StoreID
The Squid Configuration Manual http://www.squid-cache.org/Doc/config/
=cut
my @rules; # array of [regex, replacement string]
die "Usage: $0 <rewrite-file>\n" unless $#ARGV == 0;
# read config file
open RULES, $ARGV[0] or die "Error opening $ARGV[0]: $!";
while (<RULES>) {
    chomp;
    next if /^\s*#?$/;
    if (/^\s*([^\t]+?)\s*\t+\s*([^\t]+?)\s*$/) {
        push(@rules, [qr/$1/, $2]);
    } else {
        print STDERR "$0: Parse error in $ARGV[0] (line $.)\n";
    }
}
close RULES;
$|=1;
# read urls from squid and do the replacement
URL: while (<STDIN>) {
    chomp;
    last if $_ eq 'quit';
    my $channel = "";
    if (s/^(\d+\s+)//o) {
        $channel = $1;
    }
    foreach my $rule (@rules) {
        if (my @match = /$rule->[0]/) {
            $_ = $rule->[1];
            for (my $i=1; $i<=scalar(@match); $i++) {
                s/\$$i/$match[$i-1]/g;
            }
            print $channel, "OK store-id=$_\n";
            next URL;
        }
    }
    print $channel, "ERR\n";
}
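For anyone who does not read Perl, the request/reply loop above roughly boils down to the following. This is a minimal Python sketch of the same idea, not the shipped helper: the function name reply_for is made up, the sample rule is the one from the helper's own documentation, and the sample URLs are invented.

```python
import re

def reply_for(line, rules):
    """Build the helper reply for one request line from Squid.

    rules is a list of (compiled_regex, template) pairs, tried in order.
    """
    line = line.rstrip("\r\n")
    # With concurrency > 0 Squid prefixes each line with a numeric channel ID,
    # which has to be echoed back at the start of the reply.
    channel = ""
    m = re.match(r"\d+\s+", line)
    if m:
        channel = m.group(0)
        line = line[m.end():]
    for regex, template in rules:
        hit = regex.search(line)
        if hit:
            store_id = template
            # Replace $1..$n with the captured groups, highest index first so
            # that "$1" never clobbers the start of "$10".
            for i in range(len(hit.groups()), 0, -1):
                store_id = store_id.replace("$%d" % i, hit.group(i) or "")
            return "%sOK store-id=%s\n" % (channel, store_id)
    # No rule matched: tell Squid to keep using the original URL as the key.
    return channel + "ERR\n"

# Sample rule taken from the helper's documentation above.
RULES = [(re.compile(r"^http:\/\/[^\.]+\.dl\.sourceforge\.net\/(.*)"),
          "http://dl.sourceforge.net.squid.internal/$1")]

if __name__ == "__main__":
    print(reply_for("http://heanet.dl.sourceforge.net/project/x.zip", RULES), end="")
    print(reply_for("7 http://example.com/", RULES), end="")
```

The point to take away is that every request gets exactly one reply line, either OK store-id=... or ERR, with the channel ID in front when concurrency is enabled.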
OK, so here is what I am doing.
These are my custom refresh patterns for use with StoreID. The Windows section is commented out because I am not using it for this.
#WINDOWS
#refresh_pattern -i windowsupdate.com/.*\.(cab|exe|dll|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
#refresh_pattern -i microsoft.com/.*\.(cab|exe|dll|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
#refresh_pattern -i windows.com/.*\.(cab|exe|dll|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
#refresh_pattern -i microsoft.com.akadns.net/.*\.(cab|exe|dll|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
#refresh_pattern -i deploy.akamaitechnologies.com/.*\.(cab|exe|dll|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
#refresh_pattern -i msedge.b.tlu.dl.delivery.mp.microsoft.com/.*\.(cab|exe|dll|ms[i|u|f|p]|[ap]sf|wm[v|a]|dat|zip|psf) 43200 80% 129600 reload-into-ims
#range_offset_limit none
acl cdnsites dstdom_regex -i "/var/squid/storeid/conf/storeid_sites.txt"
store_id_access allow cdnsites
store_id_access deny all
store_id_program /var/squid/storeid/storeid_helper.php /var/squid/storeid/storeid_rewrite
store_id_children 10 startup=5 idle=1 concurrency=0
refresh_pattern ([^.]+\.)?(cs|content[1-9]|hsar|content-origin|client-download)\.(steampowered|steamcontent)\.com/.*\.* 43200 100% 43200 reload-into-ims ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern ([^.]+\.)?.akamai.steamstatic.com/.*\.* 43200 100% 43200 reload-into-ims ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i ([^.]+\.)?.adobe.com/.*\.(zip|exe) 43200 100% 43200 reload-into-ims ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i ([^.]+\.)?.java.com/.*\.(zip|exe) 43200 100% 43200 reload-into-ims ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i ([^.]+\.)?.sun.com/.*\.(zip|exe) 43200 100% 43200 reload-into-ims ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i ([^.]+\.)?.oracle.com/.*\.(zip|exe|tar.gz) 43200 100% 43200 reload-into-ims ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i appldnld\.apple\.com 43200 100% 43200 ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i ([^.]+\.)?apple.com/.*\.(ipa) 43200 100% 43200 ignore-reload ignore-no-store override-expire override-lastmod
refresh_pattern -i ([^.]+\.)?.google.com/.*\.(exe|crx) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern -i ([^.]+\.)?gstatic\.com/.*\.(exe|crx) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern -i ([^.]+\.)?.ubuntu.com/.*\.(deb) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
#FACEBOOK
refresh_pattern ^https?://([^/]+\.)?facebook\.com/ 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
#FACEBOOK IMAGES
refresh_pattern -i pixel\.facebook\.com.*\.(jpg|png|gif|ico|css|js) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern -i \.akamaihd\.net.*\.(jpg|png|gif|ico|css|js) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern -i (facebook\.com|85\.131\.151\.39).*\.(jpg|png|gif) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern static\.(xx|ak)\.fbcdn\.net.*\.(jpg|gif|png) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern ^https?://profile\.ak\.fbcdn\.net.*\.(jpg|gif|png) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
#FACEBOOK VIDEO
refresh_pattern -i \.video\.ak\.fbcdn\.net.*\.(mp4|flv|mp3|amf) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern (audio|video)/(webm|mp4) 10080 80% 43200 override-expire override-lastmod ignore-no-cache ignore-reload reload-into-ims ignore-private
refresh_pattern -i squid\.internal 10080 80% 79900 override-lastmod override-expire ignore-reload ignore-no-store ignore-must-revalidate ignore-private ignore-auth
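refresh_pattern takes a regular expression, not a shell-style wildcard, so it can help to try a pattern against a few sample URLs before it goes into the config. Below is a rough sketch in Python, assuming Python's re module behaves like Squid's regex engine for simple patterns like these. The helper name matching_urls and the sample URLs are made up for illustration.

```python
import re

def matching_urls(pattern, urls, ignore_case=False):
    """Return the URLs that a refresh_pattern-style regex would match."""
    flags = re.IGNORECASE if ignore_case else 0   # corresponds to -i in squid.conf
    rx = re.compile(pattern, flags)
    return [u for u in urls if rx.search(u)]

SAMPLE_URLS = [
    "http://facebook.com/",
    "http://www.facebook.com/photo.php",
    "https://www.facebook.com/photo.php",
]

# Written like a shell wildcard: in a regex "/*" means "zero or more slashes"
# and "." means "any one character".
GLOB_STYLE = r"^http://*.facebook.com/*"
# Written as a regex: http or https, optional subdomain, escaped dots.
REGEX_STYLE = r"^https?://([^/]+\.)?facebook\.com/"

if __name__ == "__main__":
    print(matching_urls(GLOB_STYLE, SAMPLE_URLS))
    print(matching_urls(REGEX_STYLE, SAMPLE_URLS))
```

With these samples the wildcard-style pattern only catches the bare http://facebook.com/ URL, while the regex form catches all three.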
Here is the StoreID program I am using. It seems to work better and comes with logging that you can activate.
#!/usr/local/bin/php -q
<?php
/*
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Rudi Servo
*/
/*
This is a CLI application made for pfSense and Squid 3.
The idea is to use the PHP already installed in pfSense to run
the storeid_helper.
As of pfSense 2.2.6, PHP is at version 5.5.30 and Squid at 3.4.
Although PHP has a bad reputation as a continuously running application,
it has become more and more stable since version 5.5.
Now, with version 7.0, it is not only stable but also has many performance improvements
that surpass most common scripting languages.
So there is no problem with PHP running this.
Usage: call the script with one or more rewrite files, or with folders containing
rewrite rules with a .conf extension.
Inside each file there must be a hard tab between the match rule and the internal Squid URL.
*/
# Include a small config file for debugging, and in case something else comes up.
include 'conf/storeid.conf.php';
if ($_DEBUG) {
    file_put_contents($_LOG_FILE, 'Worker Spawn @ '.date('Y-m-d H-i-s')."\n", FILE_APPEND);
}
# Load rules from one file: a regex, whitespace, then the Store-ID template.
function addRules(&$rules, $filePath) {
    $file = fopen($filePath, 'r');
    if ($file === false) {
        return;
    }
    while (($line = fgets($file)) !== false) {
        $line = trim($line);
        # Skip blank lines and comments so they do not turn into bogus rules.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $read = preg_split('/\s+/', $line);
        if (count($read) < 2) {
            continue;
        }
        $rules['/'.$read[0].'/'] = $read[1];
    }
    fclose($file);
}
$rules = array();
$size = sizeof($argv);
for ($i = 1; $i < $size; $i++) {
    if (is_dir($argv[$i])) {
        $path = $argv[$i];
        foreach (scandir($path) as $file) {
            # Inside a directory, only *.conf files are treated as rule files.
            if (pathinfo($file, PATHINFO_EXTENSION) == 'conf') {
                addRules($rules, $path.'/'.$file);
            }
        }
    } else {
        addRules($rules, $argv[$i]);
    }
}
if (!empty($rules)) {
    $stdin = fopen('php://stdin', 'r');
    # Stop on end of input (fgets returns false) or on an explicit "quit".
    while (($line = fgets($stdin)) !== false) {
        $line = rtrim($line, "\n\r");
        if ($line == 'quit') {
            break;
        }
        # Squid may append extra fields after the URL; match the URL only.
        $parts = explode(' ', $line, 2);
        $url = $parts[0];
        $found = false;
        $i_url = "ERR\n";
        foreach ($rules as $rule => $target) {
            if (preg_match($rule, $url, $matches)) {
                # Substitute the $n placeholders, highest first so $1 cannot
                # clobber $10, then build exactly one "OK store-id=..." line.
                $store_id = $target;
                for ($i = sizeof($matches) - 1; $i >= 1; $i--) {
                    $store_id = str_replace('$'.$i, $matches[$i], $store_id);
                }
                $i_url = "OK store-id=".$store_id."\n";
                $found = true;
                break;
            }
        }
        echo $i_url;
        if ($_DEBUG) {
            file_put_contents($_LOG_FILE, $found ? $i_url : "ERR - ".$url."\n", FILE_APPEND);
        }
    }
    fclose($stdin);
    if ($_DEBUG) {
        file_put_contents($_LOG_FILE, 'Worker Closed @ '.date('Y-m-d H-i-s')."\n", FILE_APPEND);
    }
}
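A helper like this can also be exercised outside Squid by feeding it request lines on stdin and reading back one reply per line, which is often quicker than watching the log. Below is a sketch in Python, assuming concurrency=0 as in the config above. The name ask_helper is made up, and STAND_IN is a dummy helper that answers ERR to everything so that the sketch runs on any machine.

```python
import subprocess
import sys

def ask_helper(command, urls):
    """Send each URL to a Store-ID helper and pair it with the helper's reply.

    command is the helper's argv list. The helper is expected to answer one
    line per request and to exit when it reads "quit" or reaches end of input.
    """
    request = "".join(u + "\n" for u in urls) + "quit\n"
    done = subprocess.run(command, input=request, capture_output=True,
                          text=True, timeout=30)
    replies = done.stdout.splitlines()
    return list(zip(urls, replies))

# Dummy helper used only for this sketch: replies ERR to every request.
STAND_IN = [sys.executable, "-c",
            "import sys\n"
            "for line in sys.stdin:\n"
            "    if line.strip() == 'quit':\n"
            "        break\n"
            "    print('ERR')\n"]

if __name__ == "__main__":
    for url, reply in ask_helper(STAND_IN, ["https://www.example.com/a.jpg"]):
        print(url, "->", reply)
```

To test the real helper, the command list would be the two paths from the store_id_program line, /var/squid/storeid/storeid_helper.php and /var/squid/storeid/storeid_rewrite. If a helper prints blank lines or more than one line per request, the replies stop lining up with the URLs, which is a quick sign that its output format is off.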
ref:
https://github.com/hscbrasil/hsc-dynamic-cache/blob/master/README.md
https://github.com/mmd123/squid-cache-dynamic_refresh-list
https://github.com/rudiservo/pfsense_storeid
Here is the storeid_sites.txt
([^.]+\.)?adobe\.com
([^.]+\.)?java\.com
([^.]+\.)?sun\.com
([^.]+\.)?oracle\.com
([^.]+\.)?apple\.com
([^.]+\.)?ubuntu\.com
([^.]+\.)?steampowered\.com
([^.]+\.)?steamcontent\.com
([^.]+\.)?google\.com
([^.]+\.)?gstatic\.com
([^.]+\.)?facebook\.com
([^.]+\.)?(akamaihd|fbcdn)\.net
Example for CDN Facebook.conf
# Facebook
^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/.*\/v\/.*\/(.*\.mp4) http://facebook.squid.internal/$3
^https?:\/\/fbcdn\-(static|profile)\-a\.akamaihd\.net\/static\-ak\/rsrc\.php\/((?!.*\.(?:js|css|swf)).*) http://facebook.squid.internal/static/$2
^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/(h|s)(profile|photos).*\/(.*\.(png|gif|jpg))(\?.+)? http://facebook.squid.internal/$5
^https?:\/\/fbstatic\-a\.akamaihd\.net\/rsrc\.php\/((?!.*\.(?:js|css|swf)).*) http://facebook.squid.internal/static/$1
I get a huge number of ERR replies in the logs, so the requests are being logged the way Squid's website describes.
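One way to narrow down where the ERR replies come from is to run the Facebook.conf rules against a URL copied from access.log. Below is a sketch in Python, assuming Python's re module treats these patterns the same way as PHP's preg_match. Both sample URLs are invented, and the second one is only a guess at the /v/... path shape that newer image links seem to use.

```python
import re

# The four rules from Facebook.conf above, as (pattern, Store-ID template).
RULES = [
    (r"^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/.*\/v\/.*\/(.*\.mp4)",
     "http://facebook.squid.internal/$3"),
    (r"^https?:\/\/fbcdn\-(static|profile)\-a\.akamaihd\.net\/static\-ak\/rsrc\.php\/((?!.*\.(?:js|css|swf)).*)",
     "http://facebook.squid.internal/static/$2"),
    (r"^https?:\/\/(fbcdn|scontent).*(akamaihd|fbcdn)\.net\/(h|s)(profile|photos).*\/(.*\.(png|gif|jpg))(\?.+)?",
     "http://facebook.squid.internal/$5"),
    (r"^https?:\/\/fbstatic\-a\.akamaihd\.net\/rsrc\.php\/((?!.*\.(?:js|css|swf)).*)",
     "http://facebook.squid.internal/static/$1"),
]

def store_id_for(url, rules=RULES):
    """Return the Store-ID of the first matching rule, or None for ERR."""
    for pattern, template in rules:
        hit = re.search(pattern, url)
        if hit:
            store_id = template
            # Highest group first so "$1" never clobbers the start of "$10".
            for i in range(len(hit.groups()), 0, -1):
                store_id = store_id.replace("$%d" % i, hit.group(i) or "")
            return store_id
    return None

# Invented URL in the older hphotos style that the image rule was written for.
OLD_STYLE = "https://scontent-a.xx.fbcdn.net/hphotos-xpa1/t1.0-9/12345_n.jpg?oh=abc"
# Invented URL in the assumed newer style, where the path starts with /v/.
NEW_STYLE = "https://scontent-lax3-1.xx.fbcdn.net/v/t39.30808-6/12345_n.jpg?_nc_cat=1"

if __name__ == "__main__":
    print(store_id_for(OLD_STYLE))
    print(store_id_for(NEW_STYLE))
```

If real URLs from the log behave like the second sample and come back as None, the ERR replies would simply mean that none of the rules match the current URL layout, so the rules need updating rather than the helper.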
Ref:
https://wiki.squid-cache.org/Features/StoreID
http://wiki.squid-cache.org/Features/StoreID/DB
When I researched this last time, I found that I may also need a redirector running in SquidGuard to retrieve the stored items. Can anyone help with this?