Is this possible in Xojo via a plug-in?
Save yourself a lot of work and say: no.
Or try to record samples, send them over Wi-Fi, and play them back on the other computer.
If you are lucky, you get the delay below half a second.
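To make the "record, send, play back" idea a bit more concrete, here is a rough sketch of just the sending side. It is in Python rather than Xojo, and it assumes 16 kHz, 16-bit mono PCM, 20 ms of audio per UDP packet, and a 4-byte sequence number in front of each packet. The names (`split_chunks`, `pack_chunk`, `unpack_chunk`, `send_pcm`) and all of those numbers are made up for illustration. Recording, playback, pacing, and jitter buffering are left out, and those are where most of the delay would come from.

```python
import socket
import struct

SAMPLE_RATE = 16_000      # samples per second (assumed)
BYTES_PER_SAMPLE = 2      # 16-bit mono PCM (assumed)
CHUNK_MS = 20             # milliseconds of audio per packet (assumed)
CHUNK_BYTES = SAMPLE_RATE * BYTES_PER_SAMPLE * CHUNK_MS // 1000

HEADER = struct.Struct("!I")  # 4-byte big-endian sequence number


def split_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> list[bytes]:
    """Cut a PCM buffer into fixed-size chunks; the last one may be shorter."""
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


def pack_chunk(seq: int, chunk: bytes) -> bytes:
    """Prefix a chunk with its sequence number so the receiver can order packets."""
    return HEADER.pack(seq) + chunk


def unpack_chunk(packet: bytes) -> tuple[int, bytes]:
    """Split a received packet back into (sequence number, PCM payload)."""
    (seq,) = HEADER.unpack_from(packet)
    return seq, packet[HEADER.size:]


def send_pcm(pcm: bytes, host: str, port: int) -> int:
    """Send a PCM buffer as UDP datagrams and return the number of packets sent.

    No pacing is done here; a real sender would emit one packet per CHUNK_MS.
    """
    chunks = split_chunks(pcm)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for seq, chunk in enumerate(chunks):
            sock.sendto(pack_chunk(seq, chunk), (host, port))
    return len(chunks)
```

With these assumed settings each packet carries 640 bytes of audio. UDP can drop or reorder packets, which is why the sketch adds a sequence number; the receiver would have to decide how long to wait for late packets, and that wait adds to the delay.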
I did think of that, but you get an echo of the other person talking.
What about using a webapp and some JS library?
Thinking out loud here, I have no experience with webapps or JS…
EDIT: Or an HTMLViewer?