Where ICT has RTT presentation capabilities, the ICT shall provide the ability to review previously received RTT input for the current or latest call.
NOTE 1: When the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text, while when the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to be automatically presented.
NOTE 2: Ideally, the ability to review the last call will last for at least 24 hours.
A comment from Detlev Fischer on TR 103 708 which should be considered when updating EN 301 549:
ETSI TR 103 708 V1.1.1 suggests a new requirement
6.2.2.7 Reviewing of received RTT
"Where ICT has RTT presentation capabilities, the ICT shall provide the ability to review previously received RTT input for the current or latest call."
I do not understand "current or latest call". Is "latest" a synonym for "current", or does it refer to a previous call (before the latest) that should still somehow be available? It seems that the pertinence of RTT communication would usually be limited to the active call.
"Latest" means the last finished call. What could we call it so that it is understood? It is very important that the call contents do not disappear immediately after the call, so that a slow reader can reread them. And if a new call is received before the contents are saved, they must not disappear then. So in fact a number of calls should be stored.
There are a couple of problems:
It should not only be received text that is stored. The sent text should also be stored and when it is reviewed, the approximate relative time order of the text shall be presented.
There should be an "at least" inserted to make it clear that it is better to save many recent calls.
Have we agreed to use "call", which some seem to relate only to voice calls? I have long thought of a call as a multimedia call, but if the EC has another view, we should follow that. I changed to "communication" here, but am OK with changing back.
Proposed changed text:
6.2.2.7 Reviewing of RTT
Where ICT has RTT presentation capabilities, the ICT shall provide the ability to review previously communicated RTT at least for the current and latest communication.
NOTE 1: When the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text. When the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to be automatically presented.
NOTE 2: Ideally, the ability to review the last communication will last for at least a 24 hours long communication.
Looks good. My only suggestion is to add "session" to the end so it can't be confused with "last utterance".
Where ICT has RTT presentation capabilities, the ICT shall provide the ability to review previously communicated RTT at least for the current and latest communication session.
I agree with @gregg's proposal to add "session" for clarity. Though, like @hellstromgu, I feel "call" or "multimedia call" would be much easier to understand. But I understand the need to align with the EC nomenclature.
A general comment on the provision and some comments on the Notes follow - reformulation proposals marked in italics:
Clause 6.2.2.7 introduces a difference between RTT communication and voice communication. Any voice communication can be recorded freely by the user, but there is no requirement for storage. Though, strictly speaking, one could argue that such a feature could address some accessibility needs.
In my opinion the RTT storage requirement/functionality should be provided in a similar manner - available to all users, but not required by default. Consequently, the following reformulation of the provision could be considered:
6.2.2.7 Reviewing of received RTT
"Where ICT has RTT presentation capabilities, the ICT shall provide the possibility to review previously received RTT input for the current or latest call."
Comments on the NOTES:
In NOTE 1 it is somewhat unclear whether the situation described concerns an ongoing communication session or a review of a completed communication session. To clarify, consider adding "During the ongoing communication session":
NOTE 1: During the ongoing communication session: When the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text. When the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to be automatically presented.
I am a bit confused by NOTE 2. I assume it should indicate that any RTT communication session should be available for review for at least 24 hours, not that it concerns communications that are 24 hours long. Accordingly, I propose the following reformulation:
NOTE 2: Ideally, any RTT communication session should be available for review for at least 24 hours.
I think we need to be clear about how such a recording can be presented. While it might be possible to record and playback every keystroke as it originally arrived (e.g. interleaved if multiple users were typing at the same time) that would probably be complex and present further UI challenges. I think what is really needed here is to be able to scroll back through the final text as it arrived without all the confusion of interleaved individual typing. That may seem obvious, but I think we should be clear.
Are you commenting about the current or previous call?
For the current call you need to store the characters with a source and a timestamp anyway, so that you can rearrange the presentation when the user, for example, wants to change the size of the display area or the text size. That is normal among RTT implementations.
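To illustrate the point about keeping source and timestamp, here is a minimal sketch in Python. The names RttChunk and render, the participant labels and the column-based width are all assumptions made up for this illustration; they are not taken from any RTT standard or product. The idea is only that the stored chunks stay untouched while the layout is recomputed whenever the display width changes.

```python
import textwrap
from dataclasses import dataclass


@dataclass(frozen=True)
class RttChunk:
    """One piece of received or sent text (hypothetical structure)."""
    source: str       # label of the participant who typed the text
    timestamp: float  # arrival time in seconds on some local clock
    text: str         # one or more characters that arrived together


def render(chunks, width):
    """Lay out stored chunks for a display area `width` columns wide.

    Consecutive chunks from the same source are joined into one entry,
    then each entry is wrapped to the requested width.
    """
    lines = []
    current_source, buffer = None, ""
    for chunk in sorted(chunks, key=lambda c: c.timestamp):
        if chunk.source != current_source and buffer:
            lines.extend(textwrap.wrap(f"{current_source}: {buffer}", width))
            buffer = ""
        current_source = chunk.source
        buffer += chunk.text
    if buffer:
        lines.extend(textwrap.wrap(f"{current_source}: {buffer}", width))
    return lines


chunks = [
    RttChunk("A", 1.0, "Hel"),
    RttChunk("A", 1.2, "lo"),
    RttChunk("B", 2.0, "Hi there"),
]
wide = render(chunks, 40)    # two lines, one per participant
narrow = render(chunks, 8)   # same data, re-wrapped for a narrow area
```

The same stored list could be written out in a fixed text format after the call, which is the simpler of the two storage options mentioned for previous calls.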
For the storing of previous calls, you may take the freedom to store it in a more fixed format, so that it can easily be handled by usual text tools, just as we do with stored Zoom chat etc. The other option is to record it with all source and timestamps and process it with the same software as during the call for display.
I do not think that we need to detail this.
The 24 hour limit was intended to be a reasonable limit for the duration of a call. It was a result of discussions in the STF on multiparty RTT. There was a question on how far back it must be possible to scroll back during the current call and we felt that a figure was needed even if we did not want to set a limit. So 24 hours was selected.
I basically agree with the proposal from @agata with a relatively minor edit on the requirement itself. Instead of "providing the ability" or "providing the possibility", I would say that the ICT shall enable the user to review.
6.2.2.7 Reviewing of received RTT
Where ICT has RTT presentation capabilities,
the ICT shall enable the user to review previously received RTT input for the current or latest call.
Proposal to add a minimum scroll back memory in terms of the number of characters from current and last messages that can be accessed. This would be instead of the reference to 24 hours. Needs more thought! Include the ability of users to delete all or part of this buffer.
Note that the above clarification about using "minimum number of characters" instead of "minimum file size" was to be inclusive of double-byte character sets like Cyrillic or Arabic, which convey less text in the same number of bytes than a single-byte Western European character set.
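A small Python sketch of what a character-counted scroll-back buffer with user deletion could look like. The class name ScrollbackBuffer and its methods are hypothetical, and the limit used below is an arbitrary illustration, not a proposed figure. Python strings are counted in Unicode code points, which is only an approximation of user-perceived characters.

```python
class ScrollbackBuffer:
    """Keeps the most recent `max_chars` characters (hypothetical sketch)."""

    def __init__(self, max_chars):
        if max_chars <= 0:
            raise ValueError("max_chars must be positive")
        self.max_chars = max_chars
        self._text = ""

    def append(self, text):
        # Limit by character count, not by encoded byte size.
        self._text = (self._text + text)[-self.max_chars:]

    def delete(self, start, end):
        # Lets the user erase part of the retained text, e.g. a password.
        self._text = self._text[:start] + self._text[end:]

    def clear(self):
        # Lets the user erase everything that was retained.
        self._text = ""

    @property
    def text(self):
        return self._text


# Same number of characters, different number of bytes in UTF-8.
cyrillic, latin = "Привет", "Hello!"
char_counts = (len(cyrillic), len(latin))
byte_counts = (len(cyrillic.encode("utf-8")), len(latin.encode("utf-8")))

buf = ScrollbackBuffer(max_chars=10)
buf.append(cyrillic)
buf.append(latin)       # oldest two characters fall out of the buffer
kept = buf.text
buf.delete(0, 4)        # user removes the first four retained characters
after_delete = buf.text
```

A byte-based limit of the same size would have kept fewer Cyrillic characters than Latin ones, which is the inequality the character-based wording avoids.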
Removing the 24hr minimum review time was because STF members thought this would be interpreted as a recommendation to delete all conversations after 24 hours.
As noted in the meeting - limiting by time is not good since many people who use RTT may have English as a second language -- and as they did with TTY printouts, they often get off a call and need to have someone else later explain things to them.
If one is concerned about memory, it is best to limit by memory or by characters - and this can be a BIG number since text takes up so very little compared to almost anything else on a device. And device memory keeps doubling while text storage size does not (or shrinks with compression).
For privacy we also need to require that people can erase part of the RTT conversation later (on their device) so that they do not leave a text trail of passwords or mother's maiden names etc. on their device after they hang up.
In discussing this within my own company, another issue was raised by the lawyers.
There could be legal implications in forcing buffering for later replay of RTT which do not arise with unrecorded voice communication. Specifically, if the RTT conversation is always recorded, it could be made subject to legal discovery in the event of a court order or police investigation, which cannot happen for a voice call unless it is recorded which is always optional for the participants.
Hence I think we should retain the option that RTT calls are NOT recorded unless all users want them to be. Even for voice communications, platforms have to warn all participants that the conversation will be recorded so they can drop out if they are not happy with it. Otherwise we could face the situation where RTT users are not invited to critical/confidential conversations because of legal risk raised by potential RTT discovery.
If you are saying -- that users (any ONE person on the call) should have the option to turn off any permanent record of a text conversation -- I agree -- as long as
it is an opt-out not an opt-in situation (that is, the person says/clicks "do not allow retention of text conversation")
It is not automatic -- the person saying "don't record" presses a button to make the request. If the other parties do not agree, then the person can stay silent or hang up. But they cannot unilaterally make the decision for everyone else to not have text retained. They do not have VETO power, just the ability to request it, and then to stay silent (or hang up) if the others want to continue with the text being recorded.
This is the same way it is handled for voice recording on all Video calls. A notice and then you can be quiet or leave if you don't want to be recorded.
I don't think it is ever going to be quite that simple.
At present for conferences such as Teams and Zoom, and probably for many older voice conference systems, the meeting organiser (e.g. the lawyer) can block recording (at least within the service) for all participants because they know that sensitive matters may be discussed. Also, in any case all participants in the call have to be notified that recording is taking place. If the presence of RTT means we have to put that warning on all calls that include RTT, that could have a chilling effect.
Remember that ETSI do not allow us to record our own STF calls.
We should not be diverging too far from what is done for recording of voice communications, especially if we do not have a full understanding of why those features are arranged as they are.
I'm also acutely aware that any retained text communication is FAR easier to search for keywords and other potential triggers during a legal discovery investigation than a voice recording ever would be, creating a far lower barrier to confidentiality.
I don't want RTT users to be excluded from legally sensitive discussions by their employer simply because RTT is seen as a greater legal risk to corporate confidentiality. We should try to follow the pattern used in voice communications as the mandatory approach, with anything going beyond that being optional in our requirements.
Worse, if we get this wrong, business customers may start demanding a "disable RTT for this call" feature (just as they can already disable video) which would be exactly where we don't want to go.
The regulations surrounding the recording of multiparty voice calls in Europe are primarily governed by the EU's General Data Protection Regulation (GDPR) and the ePrivacy Directive, which require explicit consent from all parties involved in the call before any recording can take place.
Interesting. M/587, Annex IV A 1 says: "Based on this request, the harmonised standards must not support any other legal requirements than those referenced in the first paragraph of this point 1".
It would not be fair of us to mislead implementers into doing something that is against ePrivacy. But the sentence above forbids us to insert such considerations. Is this maybe a similar case to the legal interception requirement, which everybody is expected to know about but which is not included in writing in any standards other than the legal intercept standards?
Requiring recording of RTT is a good traditional accessibility requirement which is included in many text telephony and RTT standards and procurement specs. Excluding it on request because of the ePrivacy directive can be seen as a good thing, but it is not our task. So, what do we do?
I'm not certain exactly what the "must not support" in Annex IV A1 is really trying to achieve. Like so much of this legal-sounding language, it is not clear. It may be trying to prevent us from including accessibility requirements designed to improve the accessibility related to another legal requirement. It would seem extremely perverse if it prevented us from highlighting/avoiding situations that are likely to lead to a breach of another legal requirement - as in this case.
It has been pointed out to me that we need to consider mixed communities when it comes to multi-party calls.
While RTT might be available to all parties, speaking/hearing people are (today) far more likely to take advantage of modern speech-to-text and AI technology which will provide on-screen text in the call for those who need it, with or without transcription/recording. So actual RTT use is likely to be limited to those who need to type to communicate. They will already be able to read what others are saying from the speech-to-text mechanisms.
Now, when it comes to record and replay, RTT will only have part of the overall "total" conversation, so the value of scrolling back through the RTT alone may be limited, since it will at best have only one side of the conversation, and at worst may be limited to a small number of interventions from RTT users among a much larger meeting. It would be more effective if the RTT content is merged with the spoken transcription, the recording and replay of which is already regulated and controlled by the call settings as directed by ePrivacy, GDPR, attorney-client rules, etc.
Having a second parallel management for RTT recording separate from that of the spoken voices seems both overly complicated for the UI, and not adding any value.
When RTT was first invented, speech-to-text was primitive and it was assumed that everyone on the call would be typing for the benefit of those who need to read rather than listen, but technology has moved on (especially over the last two years) and people are now habitually using the automatic speech-to-text capabilities of the platforms (often also with (human) language live interpretation), either for subtitling or for the transcription and AI functions (such as summarising a meeting) that are becoming commonplace.
I feel we are in danger of taking a wrong turn here, forcing the introduction of a UI feature (separate RTT replay) that
a) won't hold the total conversation, because a lot of it won't be over RTT anyway
b) will require additional control and management for legitimate (and necessary) "do not record" situations
c) will further disadvantage RTT users in comparison to their hearing colleagues
It would be much more logical to integrate RTT into the overall modern experience where typed and machine-generated speech-to-text functions are used alongside one another, and where recording and playback is based on the entire conversation rather than just the RTT component of it.
The requirement does not at all forbid integration with other text sources in the review. So your comments do not change the proposal.
You are right that automatic speech-to-text is taking over as the main mechanism for deaf /hoh persons to participate in mixed communications.
So an urgent research and development task is rapid thought-to-speech or thought-to-text generation.
Do you suggest that we should include separate requirements also on the possible merge of text technology sources?
No, I'm saying we should not be writing SHALL type requirements about recording or replay of RTT in isolation, i.e. we should not have anything like "the ICT shall provide the ability to review previously received RTT input" in our requirements without considerable justification and explanation of why that might not be appropriate or lawful in some cases.
I suspect the writers of TR 103 708 did not consider most of these real-world issues, so blindly copying (even with modification) from that TR is not the best course.
We could perhaps constrain the requirement to 1-1 calls, but even that might not be helpful since the same technology is used for 1-1 and multiparty these days. And do-not-record is a fundamental right, not a legal technicality.
Reducing the "shall" to a "should" can express the intent without constraining product designers from doing something better and more helpful to the users who need it, and without indicating that they should deviate from European privacy law. We are not supposed to be forcing implementation choices or UI designs here, nor should we be actively creating a requirement that goes against existing law.
Can you propose wording of an exception in the requirement in a similar way as the exceptions for when not having text i/o means in the original 6.2.1.1?
However, we have a requirement on us to not mention requirements from other laws and directives. But it also seems wrong to write something that we know is against other laws. What is the proper way to act here?
We are not supposed to describe other laws, but we are most definitely not supposed to include requirements that direct implementers to break those laws either. We have to create requirements that can be implemented by people who know and follow the law.
From the EAA itself:
Section I
General accessibility requirements related to all products covered by this directive in accordance with Article 2(1)
User interface and functionality design:
(k) the product shall protect the user’s privacy when he or she uses the accessibility features;
...
Section VII
Functional performance criteria
(k) Privacy
Where the product or service incorporates features that are provided for accessibility, it shall provide at least one mode of operation that maintains privacy when using those features that are provided for accessibility.
Recording of RTT for later replay which cannot be refused or controlled by other parties to the call violates privacy, the GDPR, and ePrivacy. That is simply not acceptable.
@hellstromgu, the quoted text from the EAA does not constitute "a requirement on us to not mention requirements from other laws and directives". It actually says "must not support any other legal requirements". These are very different words and may therefore mean entirely different things - but their exact meaning is definitely not clear to me.
We need to come to a conclusion with this one.
I think Loic's wording of the requirement is good. Alejandro wanted to replace review with access, but I think that the intention of the function is not so apparent anymore. There has been discussion about privacy aspects of the requirement, but no conclusion. I do not think that the current wording requires anything else than presentation and reviewing. Storing is not mentioned, so the privacy concerns should not be that problematic for this case.
Therefore, the proposal is:
6.2.2.7 Reviewing of received RTT
Where ICT has RTT presentation capabilities,
the ICT shall enable the user to review previously received RTT input for the current or latest call.
NOTE 1: During the ongoing communication session: When the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text. When the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to be automatically presented.
NOTE 2: Ideally, the ability to review the last call will last for at least 24 hours.
The EC does not like the word "call" for multimedia calls. So we could use "communication" instead even if it is not equally clear. So, the whole clause with notes again:
6.2.2.7 Reviewing of received RTT
Where ICT has RTT presentation capabilities,
the ICT shall enable the user to review previously received RTT input for the current or latest communication.
NOTE 1: During the ongoing communication: When the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text. When the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to be automatically presented.
NOTE 2: Ideally, the ability to review the last communication will last for at least 24 hours.
I think "communication" is ambiguous. That could easily mean utterance. If we don't want to use "call" then we should use "session" or something besides "communication".
With regard to privacy: I think that there is an assumption on a live call (not messaging but RTT) that the call would follow the same rules as for speech. And a deaf person's "speech" should be protected as much as anyone else's.
We can't enable users. They are either able or not. But we can allow the user (i.e., not prevent the user).
How about the following:
6.2.2.7 Reviewing of received RTT
Where ICT has RTT presentation capabilities,
the ICT shall allow users to review previously received RTT input for the current communication session, and, except where recording of the speech is not allowed, for the last communication session.
NOTE 1: During the ongoing communication, when the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text. When the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to advance the text view automatically to show new text.
NOTE 2: Ideally, the ability to review the last communication will last for at least 24 hours.
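The behaviour described in NOTE 1 can be sketched as follows. This is only an illustration in Python under assumed names (TranscriptView, add_line, scroll_to are hypothetical) and with a line-based view. It follows new text only while the user is already positioned at the latest text, and otherwise leaves the reading position alone.

```python
class TranscriptView:
    """Line-based view over a growing transcript (hypothetical sketch)."""

    def __init__(self, visible_lines):
        self.visible_lines = visible_lines
        self.lines = []
        self.top = 0  # index of the first visible line

    def _last_top(self):
        # Top position at which the newest line is still visible.
        return max(0, len(self.lines) - self.visible_lines)

    def at_latest(self):
        return self.top >= self._last_top()

    def add_line(self, line):
        # Decide before appending whether the user was reading the latest text.
        follow = self.at_latest()
        self.lines.append(line)
        if follow:
            self.top = self._last_top()  # advance the view to show new text
        # Otherwise the reading position changes only by local user action.

    def scroll_to(self, top):
        # Local user action: move the reading position, clamped to valid range.
        self.top = min(max(0, top), self._last_top())

    def visible(self):
        return self.lines[self.top:self.top + self.visible_lines]


view = TranscriptView(visible_lines=2)
for text in ("a", "b", "c"):
    view.add_line(text)
following = view.visible()       # view followed the new text

view.scroll_to(0)                # user scrolls back to earlier text
view.add_line("d")
held = view.visible()            # position unchanged by arrival of "d"

view.scroll_to(2)                # user returns to the latest text
view.add_line("e")
resumed = view.visible()         # view follows new text again
```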
On 2. This is critical. If recording has not been turned on for voice, or the call owner has recording blocked, that has to apply to RTT replay as well. This is exactly the same as for automatic transcription and translation. Equally, all users must be warned if anyone is recording a call/session and given the option to withdraw, which would also apply to RTT replay. This is not a choice we get to make in our EN. The GDPR privacy law always trumps EAA accessibility concerns, just as it trumps all other EU legislation (such as the EU Data Act).
On 1. How about "communication session"? Or (my preference) we could include a definition of "call" to clarify exactly what we mean in our document. Such as:
"call
communication session between two or more participants that occurs in real time and which can include any combination of voice, other audio, video, text messaging, screen or video sharing, subtitling, or other forms of communication"
On 3. "Allow" implies granting permission. "Enable" is probably the better word and more consistent with industry usage, at least in British/European English. I don't think anyone would equate this usage with removal of disability.
About "call". We are requested to not redefine any definitions from directives. In EECC we have in Article 2
(31) ‘call’ means a connection established by means of a publicly available interpersonal communications service allowing two-way voice communication;
It is not said that a call cannot contain other media, but in discussions in EMTEL the conclusion has been that we cannot use the term "call" for a multimedia bidirectional communication. So, in EMTEL we have ended up using "communication" because "session" is apparently also not well understood. "Communication" is not always suitable so the situation is a bit awkward.
About privacy: How do the chat services handle the privacy requirement? The communication there is usually kept for years. Is there some wording in the user agreement you need to sign to be allowed to use the chat service saying that you allow your sent text to be saved by other participants? If so, we could propose that something similar would be used for RTT.
The context for non-call "chat" is completely different, since that is an asynchronous non-real-time service, more like email than RTT. Retention is inherent in the nature of the service, and yes there are user agreements for that.
The e-privacy context here arises in real-time communications such as voice and RTT, because the expectation of privacy is very different.
Chat within a call such as a Zoom or Teams session is treated the same as the voice, as is any speech-to-text transcription or translation that may be enabled. If someone requests recording, everyone is warned but it works. If recording is not selected, or is actively blocked (e.g. for confidential discussions) it is blocked for everything. You can't enable AI tools without transcription, so that also follows the pattern. RTT needs to behave the same way.
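As a rough sketch of one reading of the proposed wording (review always possible for the current session, review of the last session only where recording is allowed), something like the following Python could apply. The class RttSession and the single recording_allowed flag are assumptions made for this illustration. Real platforms have richer consent and notification handling, and this is not a legal interpretation.

```python
from dataclasses import dataclass, field


@dataclass
class RttSession:
    """RTT text of one session, tied to the session recording setting."""
    recording_allowed: bool            # hypothetical session-wide setting
    transcript: list = field(default_factory=list)
    active: bool = True

    def receive(self, text):
        self.transcript.append(text)

    def review(self):
        # During the session, scroll-back is always available.
        # After the session, only retained text (if any) is returned.
        return list(self.transcript)

    def end(self):
        self.active = False
        if not self.recording_allowed:
            # Same rule as for voice: nothing is kept when recording is off.
            self.transcript.clear()


confidential = RttSession(recording_allowed=False)
confidential.receive("The password is ...")
during_call = confidential.review()
confidential.end()
after_call = confidential.review()

recorded = RttSession(recording_allowed=True)
recorded.receive("See you tomorrow")
recorded.end()
kept_after_call = recorded.review()
```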
I do see the problem with "call". It's unfortunate that their definition is so tied to voice. In ISO we would create a local definition only for the purpose of the single document. Being unable to do so is a real problem. Heck, look at how many different ways a term like "service" is used across just the ICT industry. In cloud computing we've had the same problem with "digital platform", and we had to publish a 40 page TS to define a taxonomy to disambiguate the term (ISO/IEC TS 5928 if you are interested - I was the project lead/editor for that one).
The same topic is handled in #415 (closed). The latest text proposal in the present issue was:
6.2.2.7 Reviewing of received RTT
Where ICT has RTT presentation capabilities,
the ICT shall allow users to review previously received RTT input for the current communication session, and, except where recording of the speech is not allowed, for the last communication session.
NOTE 1: During the ongoing communication, when the presentation is in a position for presenting earlier communicated text, the local user usually prefers that the reading position is only changed by local user action and not by arrival of new text. When the presentation is positioned to present latest text, then arrival or transmission of new text is usually preferred to advance the text view automatically to show new text.
NOTE 2: Ideally, the ability to review the last communication will last for at least 24 hours.
The number of the clause, the title and the beginning of the precondition are better in #415 (closed) in that they clearly include the local input, so only the opportunity to review the last call is better here.
So the words about reviewing the last call are moved to #415 (closed) and this issue is proposed to be closed.
We are not looking for parity per call but parity per user. There may have been no voice really used in a call when RTT was used.
We can drop this request again. Having RTT available for some time after the call was finished can be seen as an accessibility feature, because the contents of the call may have consisted of captioning of speech which was too rapid to be read during the call, and therefore could be required to be available after the call for some time.
But I get the impression that product developers will take care of providing the function anyway since it always has been included. So, for me, we can again delete the extra words about this in the requirement and note 2 in #415 (closed).