As I mentioned earlier, gpt.lua has a bug in the conditional check which caused GPT_CHECK to run for every mail.
Instead of waiting for the next Rspamd release and for the mailcow container to ship the fixed gpt.lua, I downloaded the latest gpt.lua from the Rspamd GitHub repository and placed it in plugins.d to override /usr/share/rspamd/plugins/lua/gpt.lua inside the container.
It does throw a warning about not re-registering the GPT_HAM and GPT_SPAM symbols, but it loads the updated version from plugins.d.
That solved my first issue: the GPT check is now only hit when BAYES hasn't been able to come up with a score.
So now, a couple of updates after 24 hours of observation (and after the changes I put in): the Bayesian learning (from the GPT output) has been really good, and the number of requests actually going to GPT evaluation is now far lower - mainly messages with low scores, roughly between 0 and -2.0.
That's a reduction in API hits of nearly 70%, while I've seen learning go up by over 50%.
That is a great outcome, and I'm guessing that over a longer period the actual OpenAI API use will only be there to periodically improve the Bayesian learning/training.
Now for the conditional email scanning - I figured this is done by overriding the condition function, mainly replicating the existing checks plus adding your own, where you could add:
- sender checks
- recipient checks
- subject checks
- specific content checks
You can see this in the gpt.lua source code, in the default_condition function from line 125, which gives you access to the task object:
https://github.com/rspamd/rspamd/blob/5ccf9bc7fb353c2bf20f7eb44feb283d4720bbdd/src/plugins/lua/gpt.lua
At the moment, I think these checks will be limited to config entries for the items to be checked, until I can figure out how to extend this properly and write a plugin myself.
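Just to illustrate the idea, here's a rough, untested sketch of what such an override of the condition function could look like in the overridden gpt.lua copy. The sender/recipient/subject rules and the domain/address values below are placeholders I made up, and the real default_condition does more checks than the Bayes one shown here:

```lua
-- Rough sketch of a custom condition function for gpt.lua (untested).
-- Returning false skips the GPT check for that message.
local function my_gpt_condition(task)
  -- Skip GPT when Bayes has already classified the message
  if task:has_symbol('BAYES_SPAM') or task:has_symbol('BAYES_HAM') then
    return false
  end

  -- Example sender check: skip mail from a trusted domain (placeholder value)
  local from = task:get_from('smtp')
  if from and from[1] and from[1].domain == 'example.com' then
    return false
  end

  -- Example recipient check: skip mail to a specific mailbox (placeholder value)
  local rcpts = task:get_recipients('smtp')
  if rcpts and rcpts[1] and rcpts[1].addr == 'noscan@example.com' then
    return false
  end

  -- Example subject check: skip obvious newsletters (placeholder pattern)
  local subject = task:get_subject() or ''
  if subject:lower():find('newsletter', 1, true) then
    return false
  end

  -- Everything else goes to GPT evaluation
  return true
end
```

A specific content check would work the same way, just going through the text parts (task:get_text_parts()) instead of the headers.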
TBH, I haven't actually made the above changes yet, as I first want to observe the learning/training aspect with GPT a bit longer.
Composites, which you mention, are basically the ability to take multiple rules (via the symbols attributed to them) and treat them as a separate rule in itself, with its own weight. They don't offer a way to evaluate the message or attach symbols based on its content, since they never dip into the message itself (the task object explained above).
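Just for reference, a composite definition in composites.conf looks roughly like this (the symbol names, expression and score are made-up placeholders, only to show the shape):

```
GPT_AND_BAYES_SPAM {
    expression = "GPT_SPAM & BAYES_SPAM";
    score = 3.0;
}
```

Note that it only works on symbols that have already been attached; it never looks at the message itself.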
The condition function override I mentioned above, on the other hand, gives you direct access to the message object and all its fields, including the MIME content, and lets you attach symbols for the final score assessment.
So you could have composites defined in composites.conf and use the composite symbol for additional score assessment, but I'm guessing that's not what you wanted.
In short, overriding the condition function gives you a programmatic way to control what gets evaluated via GPT and what gets bypassed.
I hope I’ve been able to address your question.
p.s.: I'm not a Lua programmer, but I learn quickly and it's more or less like Python. 🙂