среда, 9 октября 2013 г.

Bad Validation MVC

http://stackoverflow.com/questions/18938696/correcting-user-input-in-asp-net-mvc4



Background

We've been migrating a large amount of legacy code & systems to ASP.NET MVC forms. I've already coded up a number of CRUD type interfaces with MVC 4, using model binding, validation, attributes etc., so I'm fairly familiar with that whole paradigm. So far, all of these forms have been on our backend administration & management applications, where it pays to have very strict input validation. We are launching our first consumer facing application in MVC and are faced with a different type of problem.

The Problem

Our legacy forms in this area are the main revenue engine for our company. Usability of the consumer experience is the rule of the day. To that end, we want our forms to be as lenient as possible - the legacy system did a number of things to automatically correct user input (in entirely custom, non standard ways each time, of course). To that end, we don't so much want input validation as we want sanitation.

Examples

We ask the user for numerical inputs which have a unit of measure implied. Common ones are currency amounts or square footage. The input label is clear that they don't need to provide these formats:
What is the approximate square footage? (example: 2000)
What is your budget? (example: 150)
People being people, not everyone follows the directions, and we frequently get answers like:
approx 2100
1500 sq. ft.
$47.50, give or take
(Okay, I exaggerate on the last one.) The model that we are ultimately storing into our business logic accepts numeric input type for these fields (e.g. int & float). We can of course use datatype validator attributes (example [DataType(DataType.Currency)] for the budget input, or just having the field type be an integer for the square footage) to clearly indicate to the user they are doing it wrong, providing helpful error messages such as:
The square footage must be numbers only.
A better user experience, however, would be to attempt to interpret their response as leniently as possible, so they may complete the form with as little interruption as possible. (Note we have an extensive customer service side who can sort out mistakes on our system afterwards, but we have to get the user to complete the form before we can make contact.) For the square footage examples above, that would just mean stripping out non-digit characters. For the budget, it would mean stripping out everything that's not a digit or a decimal point. Only then would we apply the rest of the validation (is a number, greater than 0, less than 50000 etc.)
We're stuck on the best approach to take to accomplish this.

Potential Solutions

We've considered custom attributes, custom model bindings, and a separate scrubber service class that would live between the model and the database. Here are some of the considerations we've taken into account trying to decide upon the approach.

Custom Validation Attributes

I've read a number of helpful resources on this. (They have varying degrees of relevancy and recency. A lot of stuff I found searching for this was written for MVC2 or MVC3 and is available with standard attributes in MVC4.)
What I haven't found is anyone doing what I want to do, which would be changing the model value itself. I could obviously create a local copy of the value, sanitize it and provide a pass/fail, but this would result in a lot of duplicate code. I would still have to sanitize any input values again before saving to the database.
Changing the model value itself has 3 benefits:
  1. It affects subsequent validation rules, which would improve their acceptance rate.
  2. The value is closer to what will be put into the database, reducing the additional prep & mapping overhead needed before storage.
  3. In the event of the form being rejected for another reason, it gently suggests to the user "You're trying to hard on these fields."
Is this a valid approach? Is there someone out there who has used validation attributes in this way that I just missed?

Custom Model Binding

I read Splitting DateTime - Unit Testing ASP.NET MVC Custom Model Binders which focuses on custom date time input fields with custom validation & parsing done at the model binding layer. This lives a lot closer to the model itself, so it seems like a more appropriate place to be modifying the model values. In fact, the example class DateAndTimeModelBinder : IModelBinder does exactly that in a few places.
However, the controller action signature provided for this example does not make use of an overall model class. It looks like this
public ActionResult Edit(int id, 
    [DateAndTime("year", "mo", "day", "hh","mm","secondsorhwatever")]
    DateTime foo) {
Rather than this
public ActionResult Edit(
    MyModelWithADateTimeProperty model) {
Shortly before that, the article does say
First, usage. You can either put this Custom Model Binder in charge of all your DateTimes by registering it in the Global.asax:
ModelBinders.Binders[typeof(DateTime)] = 
   new DateAndTimeModelBinder() { Date = "Date", Time = "Time" };
Would that be sufficient to invoke the model binding for the date time field on the single-parameter model example MyModelWithADateTimeProperty?
The other potential draw back that I see here is that the model binder operates on a type, rather than an attribute you can apply to the standard data types. So for example, each set of validation rules I wanted to apply would necessitate a new, custom type. This isn't necessarily bad, but it could get messy and cause a lot of repeated code. Imagine:
public class MyDataModel {

    [Required]
    public CurrencyType BudgetRange { get; set; }

    public PositiveOnlyCurrencyType PaymentAmount { get; set; }

    [Required]
    public StripNonDigitsIntegerType SquareFootage { get; set; }
Not the ugliest model code I've ever seen, but not the prettiest either.

Custom, External scrubber class

This has the fewest questions for me, but it has the most drawbacks as well. I've done a few things like this before, only to really regret it for one of the following reasons:
  1. Being separate from the controller and model, it is nigh impossible to elegantly extend its validation rules to the client side.
  2. It thoroughly obfuscates what is and what isn't an acceptable input for the different model fields.
  3. It creates some very cumbersome hoops for displaying errors back to the user. You have to pass in your model state to the scrubber service, which makes your scrubber service uniquely tied to the MVC framework. OR you have to make your scrubber service capable of returning errors in a format that the controller can digest, which is rather more logic than is usually recommended for a controller.

The Question

Which approach would you take (or, have you taken) to accomplish this type of sanitization? What problems did it solve for you? What problems did you run into?
share|improve this question
 
This is a great question. I do not have experience with the more generous/loose validation you are describing, to me validation has always been very strict. One thing I would like to add is that if you are planning on modifying the user's input that you also show them what you have changed it to. When making a best guess or interpretation there is always the possibility of a mistake and you want to make sure that your interpretation matches their intention otherwise they will think there is something wrong because your application has saved a value they did not enter. –  asymptoticFault Sep 22 at 1:35
 
FWIW, I've had success using real-time client-side cleaning/validation that makes it very obvious to the user how you are modifying their data. Simple example: paste "$1,234.56" into a text field, and on blur it because "1235" showing them that you rounded the value and removed punctuation. If something gets to the server (malicious, uncovered case, script disabled) the validation is black and white based on strongly-typed fields in your model and the vanilla validation. –  Tim Medora Sep 22 at 2:36
 
The downside of said approach is that you are pretty much rolling your own validation and not leveraging much/any of the out of the box functionality. –  Tim Medora Sep 22 at 2:37
 
@asymptoticFault thanks! The majority of the information we ask for from the customer is informational only and does not often need to be exactly correct. The only information that does need to be exactly correct is their contact information, which we promptly validate by trying to contact them. (Well, there are other fields which need to be exactly correct, but we make them radio buttons, checkboxes or selects, which can only be invalid by malicious action on part of the user.) –  Patrick M Sep 22 at 4:51
 
And with regard to the client visibility, the goal is to get it close enough on the first try that the user isn't bothered by having to make corrections. The visibility of corrections is on our radar as nice to have (and has been for a long time), but testing has shown that it's not important to the typical customer, and it's certainly not as important to us as getting through the form. The general idea we've come up with is to have tri-state validation: instead of just error & pass, we introduce a 'warning' type state which does what Tim and asymptote suggest - tell the user what changed. –  Patrick M Sep 22 at 4:54 
I would take ModelBinder approach.
When form data comes in - it goes to model binders infrastructure. There you can override decimal model binder to refine input. After that you can send it to validation routines without neeed to write specific validation attributes or something like that.
Also you can use one intelligent model binder that will do type switch internaly or override ModelBinderProvider, so your code wont be bloated with ModelBinderAttribute. Here is Jimmy Bogartarticle about this. Also you will get some flexibility, because you can use attributes to declare if model uses strict or adaptive binding.
In overall, IMHO, validation attributes are not suposed to alter input. They should validate it. Model binders are in fact responsible for converting all weird stuff that comes in into something usable in your system and your third approach duplicates Model binder functionality)
Hope this helps and sorry for my english)
share|improve this answer

Комментариев нет:

Отправить комментарий