Hate speech (HS) filtering for individuals, 28/4/25
What’s the “third bucket”?
- Individuals, social media users
- Want to decrease their vulnerability to HS and toxicity
How would the tool be used?
- As a filter on the social media feed (a rough sketch of this loop follows this list)
- As a filter for comments that others can leave under our user’s posts (that’s more difficult; is it even possible?)
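A minimal sketch of the feed case, assuming the tool can score each post before the feed renders it. Everything here (`Post`, `toxicity_score`, the 0.7 threshold) is an illustrative assumption, not an existing API; the comments case would need the same loop hooked into the platform’s comment stream, which is why it’s harder.

```python
from dataclasses import dataclass


@dataclass
class Post:
    author: str
    text: str


def toxicity_score(text: str) -> float:
    """Placeholder for the real model call; should return a score in [0, 1]."""
    return 1.0 if "<slur>" in text else 0.0  # trivial stand-in, not real logic


def filter_feed(posts: list[Post], threshold: float = 0.7) -> list[Post]:
    """Keep only posts scored below the user's chosen toxicity threshold."""
    return [p for p in posts if toxicity_score(p.text) < threshold]
```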
Current tools that social media platforms have
Problems with the tools mentioned above
There are some AI moderation tools that individuals can use - to be looked at in more detail!!
Aspects important when designing the tool for individual users (rather than communities):
- Lower cost
- Less training data
- The tool should work reasonably well from the beginning
- Lower chance that the user will want to validate the model (meaning: provide feedback on whether classification results were correct or not)
The consequences of the aspects above:
- Greater need for existing HS datasets to use as starting data for fine-tuning
- It might be practical to start with potential users with specific profiles → in line with datasets that we have
- Use Llama, definitely not OpenAI
- For adaptation: rely as much as possible on prompt engineering, and less on fine-tuning and RAG (see the sketch after this list)
- Consider: prepare several profile categories, each connected to a fine-tuned model, that can be used as starting points for individual users
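A sketch of what the prompt-engineering route could look like, assuming a locally hosted Llama checkpoint served through Hugging Face `transformers`. The model name, prompt wording, label set, and per-category checkpoint names are all assumptions for illustration.

```python
from transformers import pipeline

# Hypothetical per-category starting checkpoints (placeholder names), in line
# with the idea of several fine-tuned models as starting points:
STARTING_MODELS = {
    "default": "meta-llama/Llama-3.1-8B-Instruct",
    "profile_a": "our-org/llama-hs-profile-a",  # assumed fine-tuned variants
    "profile_b": "our-org/llama-hs-profile-b",
}

generator = pipeline("text-generation", model=STARTING_MODELS["default"])

PROMPT = (
    "You are a content filter for a user who is targeted by hate speech.\n"
    "Label the post as HATE, TOXIC, or OK. Answer with one word.\n\n"
    "Post: {post}\nLabel:"
)


def classify(post: str) -> str:
    out = generator(PROMPT.format(post=post), max_new_tokens=3)
    completion = out[0]["generated_text"].split("Label:")[-1].strip()
    return completion.split()[0] if completion else "OK"
```

The appeal of keeping the adaptation in the prompt is that a user-specific change is just an edited string, with no retraining cost; that fits the lower-cost, less-training-data constraints above.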
Important question: What can we offer that will be unique, and considerably better than existing options?
→ designing with lived experience in mind… but what does it really mean for individuals?
→ consider the ethos of democratising AI, empowering individuals
An idea for user pooling:
What if users in the same categories, or with similar profiles, felt stronger as part of a group with a common goal?
What if they were willing to offer some of their data, trusting that others would reciprocate, so that the model(s) they use could evolve and serve them better? (A sketch of the mechanics follows below.)
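One way the pooling could work mechanically, as a sketch: contributions are grouped by profile category, and only explicitly opted-in items ever enter a pool. All names here are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Contribution:
    user_id: str
    category: str   # the profile group the user identifies with
    text: str
    label: str      # the user's own judgement, e.g. "hate" / "ok"
    opted_in: bool  # nothing enters a pool without an explicit opt-in


def build_pools(contributions: list[Contribution]) -> dict[str, list[Contribution]]:
    """Group opted-in contributions by category, one fine-tuning pool each."""
    pools: dict[str, list[Contribution]] = defaultdict(list)
    for c in contributions:
        if c.opted_in:
            pools[c.category].append(c)
    return pools
```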
Some comments on this idea
- Has to be opt-in
- It could make users feel they are fighting the good fight together with others
- It could help with the problem of individuals having less data to fine-tune the model
- Important considerations: privacy and safety!!!
- The user should always be able to see the data they shared
- Ideally, the user should always be able to withdraw the data they shared, although this is trickier once the data has already been used for fine-tuning (see the ledger sketch after this list)
- When thinking about sharing data and privacy, the starting point is to assume that all data contain vulnerable / identifiable elements! So sharing does not mean showing to others, but submitting to model fine-tuning
- Once we have ways to ensure safety, we can offer the option to submit data that others can see
- Fine-tuning vs RAG? That might depend on the use case
- This option needs to be introduced in such a way that users feel in control and feel they can trust the tool - otherwise it could fail or even backfire!
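A sketch of the view/withdraw mechanics, assuming shared data is only ever used for fine-tuning (never shown to others, per the note above). Withdrawal excludes an item from all future runs and returns the past runs that already used it, since data baked into an existing fine-tuned model cannot simply be deleted; those models would need retraining or retiring. All names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SharedItem:
    item_id: str
    text: str
    withdrawn: bool = False
    used_in_runs: list[str] = field(default_factory=list)  # fine-tuning run ids


class ConsentLedger:
    """Per-user record of shared data: always viewable, always withdrawable."""

    def __init__(self) -> None:
        self.items: dict[str, SharedItem] = {}

    def share(self, item: SharedItem) -> None:
        self.items[item.item_id] = item

    def view(self) -> list[SharedItem]:
        """The user can always see exactly what they have shared."""
        return list(self.items.values())

    def withdraw(self, item_id: str) -> list[str]:
        """Exclude the item from future runs; return the past runs that used
        it, so those models can be retrained or retired."""
        item = self.items[item_id]
        item.withdrawn = True
        return item.used_in_runs

    def training_set(self) -> list[SharedItem]:
        """Only non-withdrawn items are eligible for the next run."""
        return [i for i in self.items.values() if not i.withdrawn]
```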