Thursday, November 12, 2015

MacPhun launches Aurora HDR software with HDR expert Trey Ratcliff

HDR, or high dynamic range, is a wildly popular photography technique for combining several exposures of the same scene to capture a wider range of brightness levels than a single picture can.

It's built into some imaging devices (like the iPhone's camera), but it's more often applied after capture by photographers who 'bracket' several exposures and combine them with software like this.
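Under the hood, the basic technique is exposure fusion. Here's a minimal sketch of that general idea using OpenCV rather than MacPhun's own pipeline (the file names are placeholders):

    import cv2
    import numpy as np

    # Load three bracketed exposures of the same scene (placeholder file names).
    exposures = [cv2.imread(p) for p in ("under.jpg", "mid.jpg", "over.jpg")]

    # Align the frames first, in case the camera moved between shots.
    cv2.createAlignMTB().process(exposures, exposures)

    # Mertens exposure fusion blends the frames without needing exposure times
    # and returns a tone-mapped float image in roughly the 0..1 range.
    fused = cv2.createMergeMertens().process(exposures)
    cv2.imwrite("hdr_result.jpg", np.clip(fused * 255, 0, 255).astype(np.uint8))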

In this instance, MacPhun has teamed up with renowned HDR photographer Trey Ratcliff to produce Aurora HDR, combining its own software processing with Ratcliff's experience and passion.

How it works

HDR images can range in style from the naturalistic, so that you don't really notice the image has been processed at all, to artistic and dramatic, where the HDR effect is an integral part of the image. And MacPhun reckons it covers the whole spectrum with its range of one-click preset effects.

You're not restricted to these presets, though, because they're backed up by a wide range of manual HDR tools – each preset is simply a set of adjustments you can tweak, add to or replace at will. It also offers layering and masking options, so it's possible to apply effects selectively to different parts of the photo.
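The masking idea boils down to blending an adjusted copy of the image back over the original through a feathered mask. A generic sketch (illustrative NumPy/OpenCV code, not Aurora's actual engine; the contrast boost stands in for any adjustment):

    import cv2
    import numpy as np

    img = cv2.imread("photo.jpg").astype(np.float32)

    # An "adjustment layer": here just a heavy contrast boost as a stand-in effect.
    adjusted = np.clip((img - 128) * 1.5 + 128, 0, 255)

    # A mask confines the effect to part of the frame (here the top half, i.e. the sky).
    mask = np.zeros(img.shape[:2], np.float32)
    mask[: img.shape[0] // 2] = 1.0
    mask = cv2.GaussianBlur(mask, (51, 51), 0)[..., None]  # feather the edge

    result = adjusted * mask + img * (1 - mask)
    cv2.imwrite("masked_effect.jpg", result.astype(np.uint8))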

Aurora HDR

The Standard version works as a standalone application – you choose the set of images you want to blend using the Mac Finder, then export the finished HDR image as a new file. It can work with JPEG, raw and TIFF files.

The Pro version adds plug-ins for Lightroom, Photoshop and Aperture, so you can export selected images straight to Aurora HDR.

In both cases, a window previews the images you've selected and offers checkboxes for alignment, ghosting and chromatic aberration. If you've used a tripod you probably won't need the alignment option but the others could still prove useful.

Aurora HDR will be available from November 19th on the Mac App Store at an introductory price of $39.99 (about £26/AU$56) before reverting to its normal price of $49.99 (about £33/AU$70).

Aurora HDR Pro will be available direct from www.macphun.com at an introductory price of $89.99 (about £59/AU$126) before reverting to its normal price of $99.99 (about £66/AU$140).

Can computers tell how you're feeling?

Introduction and video wizardry

Microsoft's HowOld site for guessing your age from a photo was a viral success. The new MyMoustache site that tells you how the moustache you're growing for the annual 'Movember' charity is coming along (and offers to give you a fake moustache if you just want to join in the fun) might not take off the same way.

But it does show off some of the new tools Microsoft has added to the Project Oxford APIs that let developers use machine learning to find faces, understand what users say and type – and now, how they might be feeling.

MyMoustache

"The emotion API detects emotions in human faces," Ryan Galgon of Microsoft's Technology and Research group told techradar. It suggests up to eight emotions that he calls 'universal' for faces detected in an image – anger, contempt, fear, disgust, happiness, neutral, sadness or surprise (or a mix of those) – and it can work with multiple faces in a picture. "We can already tell what's happening in photos and who is in photos, and now we can move beyond that, with sentiment analysis."

Imagine a photo app that automatically composites faces from multiple images so you get a family photo where everyone is smiling. "Or you could pick the best photo in an album based on whether people are smiling or not," Galgon suggests.
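As a sketch, that "best photo" pick could reduce to scoring each frame by the happiness values the service returns. The endpoint URL, header and response shape below follow Project Oxford's public documentation of the era, but treat them, and the key, as assumptions:

    import requests

    ENDPOINT = "https://api.projectoxford.ai/emotion/v1.0/recognize"
    HEADERS = {
        "Ocp-Apim-Subscription-Key": "YOUR_KEY",  # placeholder
        "Content-Type": "application/octet-stream",
    }

    def average_happiness(image_path):
        """Score one photo: mean 'happiness' across all detected faces."""
        with open(image_path, "rb") as f:
            faces = requests.post(ENDPOINT, headers=HEADERS, data=f.read()).json()
        if not faces:
            return 0.0
        return sum(face["scores"]["happiness"] for face in faces) / len(faces)

    # Pick the album photo where people look happiest on average.
    album = ["shot1.jpg", "shot2.jpg", "shot3.jpg"]
    print("Best photo:", max(album, key=average_happiness))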

Detecting beards and moustaches is another of the new face recognition options that developers will be able to use. "We also have significant improvements for detecting age and gender," Galgon told us. Some of the new options are available straight away, and others will be available over the coming weeks.

Video wizardry

The existing face detection options will now work for video as well as still images, and the APIs can follow a particular person's face through a video. Initially that's about finding a face in the video, including knowing that faces don't usually disappear – so even if it's not detected in one frame it's likely to be there.
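That "faces don't usually disappear" heuristic can be sketched as a track that survives a few missed detections. This is a generic illustration of the persistence idea, not Microsoft's tracker; the detection lists stand in for whatever per-frame detector you use:

    # A track keeps its last known box alive for a few frames after the
    # detector loses it, on the assumption the face is still there.
    MAX_MISSES = 5

    def update_track(track, detections):
        if detections:
            track["box"] = detections[0]  # naive: follow the first detection
            track["misses"] = 0
        else:
            track["misses"] += 1
        track["alive"] = track["misses"] <= MAX_MISSES
        return track

    track = {"box": None, "misses": 0, "alive": True}
    for frame_detections in [[(10, 10, 50, 50)], [], [], [(12, 11, 50, 50)]]:
        track = update_track(track, frame_detections)
        print(track["alive"], track["box"])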

Video face detection

In time, though, you're likely to be able to do the same kind of things for faces detected in a video that you can for faces detected in photos, Galgon says – so you could detect the emotions displayed during the video and look for when they change. "The APIs we have are starting to be able to work together, like the face detection and emotion detection. The direction we're going for is to have them provide a common set of capabilities, regardless of the type of input."

Not all of the frames in video will be interesting, or fully in focus, of course. Two further new video tools in Project Oxford do image stabilisation to clean up the video (using similar research to Microsoft's Hyperlapse high-speed video) and motion detection. "The problem with motion detection is the false positives," Galgon points out. "You don't want to detect motion every time a cloud moves across the sky or a car drives past; you want to detect where there is motion in the foreground."
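One common way to suppress those false positives is background subtraction plus a size threshold, so only sizeable foreground blobs count as motion. A generic OpenCV sketch, not Project Oxford's algorithm (the clip name is a placeholder):

    import cv2

    cap = cv2.VideoCapture("clip.mp4")
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

    MIN_AREA = 2000  # ignore small blobs: leaves, distant clouds, sensor noise

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        # [-2] picks the contour list across both OpenCV 3.x and 4.x signatures.
        contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                    cv2.CHAIN_APPROX_SIMPLE)[-2]
        if any(cv2.contourArea(c) > MIN_AREA for c in contours):
            print("Foreground motion detected")
    cap.release()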

Motion detection

Learning new words

A new spell checking service is designed to clean up text users type into apps, especially on mobile devices, where it's easy to miss off a letter or put a space in the middle of a word. The API can fix both, and it also looks at context to catch mistakes like 'four' instead of 'for'. "There might be misspellings that can throw off the system," Galgon pointed out. "If they're looking for Chicago, typing hicago isn't going to find it."
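In practice that looks like sending the raw text to the service and applying its suggested replacements. The URL, parameters and flaggedTokens response shape here are patterned on Microsoft's spell-check preview of the time; treat them, and the key, as assumptions:

    import requests

    ENDPOINT = "https://api.projectoxford.ai/text/v1.0/spellcheck"
    HEADERS = {"Ocp-Apim-Subscription-Key": "YOUR_KEY"}  # placeholder

    resp = requests.get(ENDPOINT, headers=HEADERS,
                        params={"text": "I am flying to hicago four the weekend",
                                "mode": "proof"})

    # Each flagged token carries ranked replacement suggestions.
    for token in resp.json().get("flaggedTokens", []):
        best = token["suggestions"][0]["suggestion"]
        print(f"{token['token']!r} -> {best!r}")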

Instead of the traditional spell check that just looks up words in a dictionary, the idea is to have the spelling API be able to deal with slang and 'informal' language. "The challenge is adapting over time when new phrases get coined or when a new startup becomes popular. So all of a sudden 'lift' is spelled 'Lyft' and it's a valid word that wasn't a word a year ago. The nice thing about making this a web service is that when we have new words and models, we update those in the back end and developers get better results for free."

The spell check API won't learn how different people misspell words (although that's a possible area of research), but you can give it specific terms for your application. Galgon suggests: "Imagine being able to build a better speller for a particular domain, you can tell the API, here's a set of our product names that might not get recognised correctly."

Audio services and powerful AI

Speaker recognition

Two audio services will be available later in the year. The new speaker recognition API will be able to work out who's talking – not just to tell people apart in an audio track, but to recognise them specifically, based on a speech model built from existing recordings. "People can enrol their voice – we let them say a phrase and build a speech model from that, then when you send audio from them we can say 'this is Ryan, or that is Mary'."

That's the equivalent of the face verification API for speech, he explains. "That tells you with two images, what's the likelihood that this is the same face in both of them. Here we can say, given this audio file and this historical audio file, what's the likelihood it's the same person speaking."
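A hedged sketch of that enrol-then-verify flow is below. The routes and field names are illustrative placeholders patterned on the Project Oxford speaker recognition preview, not verified signatures, and the audio files are stand-ins:

    import requests

    BASE = "https://api.projectoxford.ai/spid/v1.0"      # assumed base URL
    HEADERS = {"Ocp-Apim-Subscription-Key": "YOUR_KEY"}  # placeholder

    # 1. Create a profile and enrol it with a recorded passphrase.
    profile = requests.post(f"{BASE}/verificationProfiles",
                            headers=HEADERS, json={"locale": "en-US"}).json()
    profile_id = profile["verificationProfileId"]

    with open("ryan_enrolment.wav", "rb") as f:
        requests.post(f"{BASE}/verificationProfiles/{profile_id}/enroll",
                      headers=HEADERS, data=f.read())

    # 2. Later, check whether a new clip is the same speaker.
    with open("unknown_speaker.wav", "rb") as f:
        result = requests.post(f"{BASE}/verify", headers=HEADERS,
                               params={"verificationProfileId": profile_id},
                               data=f.read()).json()
    print(result)  # e.g. {"result": "Accept", "confidence": "High"}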

A voice is unique and apps could use it instead of a password in some situations, he suggests. "It's not as secure as chip and pin, but it's useful for apps that only need lighter authentication."

Background noise

And Custom Recognition Intelligent Services – CRIS for short – learns the acoustics of difficult environments, or the speaking style of people whose speech is currently harder to recognise, to make voice recognition more accurate.

"Right now, the speech APIs don't do a great job with kids' voices or with elderly folks or people who speak English as a second language," he explains. "They've mainly been trained with people working in an office and in an acoustic model of somewhere like a conference room. If you're at a kiosk at an airport or a baseball stadium, or you've got a mascot at a sports event and you want the system to be able to hear users and talk back to them in some way – the acoustic environment is very challenging at a sports game. There's a lot of background noise, there might be echoing."

Child voice model

It takes five or ten minutes of audio, which then needs ten or twenty minutes to process, so you can't yet do it in real time, but CRIS can significantly improve recognition accuracy.

The system can also build a model of how people speak from a couple of sample sentences, and you can add labelled phrases for unusual words – Galgon notes, "If you have player names or specific sports terms that a default recogniser isn't going to recognise."
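The adaptation data itself is simple in principle: paired audio clips and verbatim transcripts, plus a list of domain phrases. A purely hypothetical layout for illustration (CRIS itself was driven through Microsoft's portal, not this structure):

    # Hypothetical adaptation bundle for a sports-stadium kiosk.
    acoustic_samples = [
        ("clip_001.wav", "welcome to the stadium"),  # audio + verbatim transcript
        ("clip_002.wav", "where is gate twelve"),
    ]

    # Domain phrases a default recogniser is unlikely to know.
    custom_phrases = ["nutmeg", "extra time", "Zlatan Ibrahimovic"]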

And crucially, it's not difficult to use. "That's been a complex task that's required a lot of expertise in the past. Pretty much anyone can do this."

Ease of use

Making these powerful AI features easy enough for developers to use with a couple of lines of code is what Galgon thinks is really different about Project Oxford (which remains free while it's in preview, although some of the features are now included in the Cortana Analytics Suite, so businesses can use them to recognise customers using face verification or analyse sentiment in customer feedback on their website).

"We're going to keep expanding the portfolio and the set of APIs over time. But we've focused on making it as easy as possible for developers to use, regardless of what platform they use – you can use this for any OS, any website. People without any experience of AI could make software understand what someone was saying."

In time, he thinks we'll just expect software that has these kind of smarts built in. "These are things that are human and natural to do. Our apps and our software should be able to hear and understand the world around them."
