Tag Validator

This page explains Java solution to problem Tag Validator using String data structure.

Problem Statement

Given a string representing a code snippet, you need to implement a tag validator to parse the code and return whether it is valid. A code snippet is valid if all the following rules hold:

  • 1. The code must be wrapped in a valid closed tag. Otherwise, the code is invalid.
  • 2. A closed tag (not necessarily valid) has exactly the following format : <TAG_NAME>TAG_CONTENT</TAG_NAME>. Among them, <TAG_NAME> is the start tag, and </TAG_NAME> is the end tag. The TAG_NAME in start and end tags should be the same. A closed tag is valid if and only if the TAG_NAME and TAG_CONTENT are valid.
  • 3. A valid TAG_NAME only contain upper-case letters, and has length in range [1,9]. Otherwise, the TAG_NAME is invalid.
  • 4. A valid TAG_CONTENT may contain other valid closed tags, cdata and any characters (see note1) EXCEPT unmatched <, unmatched start and end tag, and unmatched or closed tags with invalid TAG_NAME. Otherwise, the TAG_CONTENT is invalid.
  • 5. A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa. However, you also need to consider the issue of unbalanced when tags are nested.
  • 6. A < is unmatched if you cannot find a subsequent >. And when you find a < or </, all the subsequent characters until the next > should be parsed as TAG_NAME (not necessarily valid).
  • 7. The cdata has the following format : <![CDATA[CDATA_CONTENT]]>. The range of CDATA_CONTENT is defined as the characters between <![CDATA[ and the first subsequent ]]>.
  • 8. CDATA_CONTENT may contain any characters. The function of cdata is to forbid the validator to parse CDATA_CONTENT, so even it has some characters that can be parsed as tag (no matter valid or invalid), you should treat it as regular characters.
Example 1:

Input: <DIV>This is the first line <![CDATA[<div>]]></DIV>
Output: True

Example 2:

Input: "<A>   <B> </A>       </B>"
Output: False

Solution

If you have any suggestions in below code, please create a pull request by clicking here.

package com.vc.hard;

import java.util.*;

class TagValidator {
    private int index = 0;
    private String CDATA_PREFIX_START = "<![CDATA[";
    private String CDATA_PREFIX_END = "]]>";

    private String OPENING_TAG_START = "<";
    private String OPENING_TAG_END = ">";

    private String CLOSING_TAG_START = "</";
    private String CLOSING_TAG_END = ">";

    public boolean isValid(String code) {
        Stack<String> st = new Stack<>();
        int n = code.length();
        while(index < n) {
            if(index > 0 && st.isEmpty()) return false;
            else if(code.startsWith(CDATA_PREFIX_START, index)) {
                if(!processCDATA(code)) return false;
            }
            else if(code.startsWith(CLOSING_TAG_START, index)) {
                if(!isValidClosingTag(code, st)) return false;
            }
            else if(code.startsWith(OPENING_TAG_START, index)) {
                if(!isValidOpeningTag(code, st)) return false;
            }
            else index++;
        }
        return st.isEmpty();
    }

    private boolean processCDATA(String code) {
        int start = code.indexOf(CDATA_PREFIX_START, index);
        int end = code.indexOf(CDATA_PREFIX_END, start + CDATA_PREFIX_START.length());
        if(start != -1 && end != -1) {
            index = end + CDATA_PREFIX_END.length();
            return true;
        }
        return false;
    }

    private boolean isValidOpeningTag(String code, Stack st) {
        int start = code.indexOf(OPENING_TAG_START, index);
        int end = code.indexOf(OPENING_TAG_END, start + OPENING_TAG_START.length());
        if(start != -1 && end != -1) {
            String tagName = code.substring(start + 1, end);
            if(tagName.length() <= 9
                    && tagName.length() >= 1
                    && containsUpper(tagName)) {
                st.push(tagName);
                index = end + OPENING_TAG_END.length();
                return true;
            }
        }
        return false;
    }

    private boolean isValidClosingTag(String code, Stack st) {
        int start = code.indexOf(CLOSING_TAG_START, index);
        int end = code.indexOf(CLOSING_TAG_END, start + CLOSING_TAG_START.length());
        if(start != -1 && end != -1) {
            String tagName = code.substring(start + 2, end);
            if(tagName.length() <= 9
                    && tagName.length() >= 1
                    && containsUpper(tagName)
                    && !st.isEmpty()
                    && st.pop().equals(tagName)) {
                index = end + CLOSING_TAG_END.length();
                return true;
            }
        }
        return false;
    }

    private boolean containsUpper(String tagName) {
        for(int i = 0; i < tagName.length(); i++) {
            if(!(tagName.charAt(i) >= 'A' && tagName.charAt(i) <= 'Z')) return false;
        }
        return true;
    }
}

Time Complexity

O(N) Where
N is length of input string

Space Complexity

O(N) Where
N is length of input string