LOC-10: Clients can spoof sender value



Issue Information

Issue Type: Bug
 
Priority: Critical
Status: Closed

Reported By:
Ben Tasker
Assigned To:
Ben Tasker
Project: LocalChat (LOC)
Resolution: Fixed (2018-05-13 13:04:00)
Affects Version: v0.0.1a, V0.0.2,
Target version: V0.0.2,
Components: Client , Server ,

Created: 2018-05-13 12:05:39
Time Spent Working


Description
In LOC-5 a specific issue which allowed users to sign in as SYSTEM was fixed.

However, there's still an issue which allows users to send messages as SYSTEM (or for that matter, any username - valid in the room or not).

When a sendMsg request is submitted, our payload looks something like this
{"roomName": "BenTest", "msg": "jA0ECQMCBRTd2w/mrx9g0lsBptF2OZMnwktCl+0zLdNKi/SdM7/f1c3AuA8nxMajPhNiqQB31E1Y\\n+yrdfFkWTDU0LCRZ15Zh95u8znKdgzAzfUpbP0cJQa+wm83Pc475gTaMTtwkunG9nL7m\\n", "sesskey": "[sesskey]", "user": "ben2"}


The fields sesskey and user are used to check the user is authorised to send messages into the room.

The server does not have encryption keys do decrypt the message payload. However, when decrypted (by the client) it's structure is
        msg = {
            'user': 'ben2',
            'text': 'This is a test message'
            }


When a client calls pollMsg, it will receive any queued messages effectively in the form of a JSON array, consisting of the message ID and the encrypted payload:
[[1,'jA0ECQMCBRTd2w/mrx9g0lsBptF2OZMnwktCl+0zLdNKi/SdM7/f1c3AuA8nxMajPhNiqQB31E1Y\\n+yrdfFkWTDU0LCRZ15Zh95u8znKdgzAzfUpbP0cJQa+wm83Pc475gTaMTtwkunG9nL7m']]


The client will use the value of user as set in the payload (as the other value isn't present in the response to pollMsg).

The issue with this, of course, is there's no way for the server to verify that the value of user in the encrypted payload matches that in the unencrypted version. So a user could conceivably authenticate as them, whilst spoofing their ID in the payload:
ENCRYPTED_DATA=$( do_encrypt '{ "user":"SYSTEM","text":"Authentication timed-out. Please re-enter you password" }' )

curl -v -X POST http://127.0.0.1:8090/ -H "Content-Type: application/json" --data '{"action":"sendMsg","payload":"{\"roomName\":\"BenTest\", \"msg\":\"$ENCRYPTED_DATA\",\"user\":\"ben2\",\"sesskey\":\"$SESS_KEY"}"}'


This would appear to others in the room as a system message, though you could just as easily enter anything you want to appear as a username.

This was a previously known issue, but was blocked by getting the authentication mechanisms (LOC-2) in place. Now that that's done, need to make some adjustments to prevent this from being possible.


Issue Links

Toggle State Changes

Activity


There's a trade-off to be made here.

The only way this can be avoided is to use a value that the server can see (like the unencrypted version). However, that means that a cleartext copy of the username will be sat in the database alongside the encrypted message (it actually already is, pending a decision on this issue) providing a small level of metadata in terms of who's posting regularly (and when) for as long as those messages are retained (which shouldn't be long once LOC-6 is implemented). There's no real value in encrypting that value before dumping into the DB, as the server process will hold the key to decrypt - it just provides an extra hop for an attacker, whilst incurring a lot of processing overhead in a single-threaded blocking application.

The only way to avoid that, though, is to accept that sender spoofing is a possibility and leave things much as they are (other than removing the cleartext username from the DB).

So the question is this:

Do we mind revealing a small amount of metadata (which should only be available if the server is compromised) in exchange for assurances about who the sending user actually was? Or do we accept that sender information may not be trustworthy and push it onto the user to decide whether they think a specific user actually sent them the message that they appear to have done?
My current thinking is that given the primary use-case of LocalChat - discreet, clandestine chats - it's far more important to be able to authenticate that a message came from the expected source.

If we imagine a user-base in an authoritarian and oppressive country, they'd already have a lot of things to be very careful about, and communication may already be fairly loosely worded to mitigate the risk if leaks were to occur. Expecting those users to then make a decision about whether every single message was actually sent by the stated user (especially as that decision could feasibly be life & death) seems like it's increasing risk.

There's already an element of having to trust a user is who they say they are - if user Bob joins, there's always the risk it's actually the authorities signing in as Bob. That can be addressed, to some extent, by Alice asking questions to verify the authenticity of Bob (i.e. only questions he would know the answer to) - at least, so long as bob isn't actively co-operating with those authorities.

However, if we accept that messages within the system may not be authentic, once Bob passed the test, Chuck could subsequently start spoofing his headers to appear as though Bob is sending messages (particularly if Bob has gone quiet).

That sounds like a dangerous risk to leave open.

Whilst having the usernames recorded in the clear within the server's memory isn't particularly desirable, it does feel like a much lesser risk. Having gained access to the memory (by whatever means), Chuck can then ascertain to some extent, who was talking most and roughly when they were talking (assuming messages haven't been cleared by LOC-6).

That information could be used to identify which user led the "meeting" (if there's a person in charge, they tend to talk the most) and possibly what timezone the majority of users were in (based purely on when the meeting happened). Obviously if Bob has used the username Bob19780121 there's additional information which could be gathered (though that becomes true regardless, as Bob's username will be in the users table).

The information gleaned from that metadata could possibly be used to identify a rough location for some or all of the users. Not fixing the authenticity issue, however, would allow Chuck to communicate directly with participants, whilst pretending to be someone they may take instructions from - IoW, it opens the possibility that Chuck could compel some kind of action from Alice, which could present significant danger.

I think, therefore, that the conclusion is obvious - we should adjust the API format so that the server validated username is used. The other field can be removed from the encrypted payload's structure to ensure it doesn't accidentally get used for anything in future.
btasker changed priority from 'Major' to 'Critical'
Repo: LocalChat
Host:Rimmer

commit d12630afc3930ef9814a8d464fe7fe81ecaf3997
Author: B Tasker <github@<Domain Hidden>>
Date: Sun May 13 12:51:36 2018 +0100

Commit Message: LOC-10 - Use server-validated username to define message sender

We now use the username field that's authenticated by the server when displaying who the sender was. The "user" field has been removed from the encrypted message payload as being potentially dangerous (and no longer used in any case).

This means that it should no longer be possible to spoof the sender of messages. Instead you'd need to compromise an existing user's session or password.

The trade off of this is that there's a slightly increased level of metadata in memory on the server, but having weighed the options in LOC-10 this was deemed lower risk than the alternative.

client/LocalChatClient.py | 2 +-
server/LocalChat.py | 84 ++++++++++++++-------------------------------
2 files changed, 27 insertions(+), 59 deletions(-)


View Commit | View Changes
One thing we could look at to lessen the metadata risk would be to use something like sqlcipher to interact with the in-memory database. That way it'd be encrypted (AES-256 IIRC) data in memory rather than a plaintext SQLite database.

Though, the plaintext versions would obviously swap in and out of memory periodically, so I don't know how much value it'd actually add in comparison to the overhead. Will raise a separate issue (LOC-12) to look into that.
btasker changed status from 'Open' to 'Resolved'
btasker added 'Fixed' to resolution
btasker changed status from 'Resolved' to 'Closed'