Hello,
I have an ML-related task: classify text along two dimensions:
1. Automated or Manual
2. Preventive or Detective
The approach should be as follows:
1. Classifier 1 (preventive or detective) and Classifier 2 (automated or manual). The classifier result carries a weight of 60%.
2. Who performs the control (if it is a human then it is manual, if it is a system then it is automated)
3. Where is the control performed
4. What is performed? (if it contains a verb similar to "automate", "automatically", "system performs", "configure", "produce", "generate", "RPA", then it is automated; if it contains a verb similar to "set up", "review", "validate", "reconcile", "manual", "SOD", "segregation of duty", then it is manual)
5. What is performed (if it contains any of the following words then it is a detective control: "compare", "monitor", "review", "reconcile", "track", "resolve", "identify", "log", "audit trail", "report", "dashboard", "diagnose", "escalate", "investigate", "variance", "error", "exception", "detect", "differences", "compare against", "discrepancies", "analysis", "evaluate", "recertification", "statement"; else if it contains "prevent", "prior to", "validate", "restrict", "approve", "sign-off", "test", "authorize", "configure", "establish", "define", "develop", "protect", "prepare", "verify", "track", "process", "document", "input", "request", "before", "checklist", "quality review", "quality assurance", "restricted to", "roles and responsibilities", "policies", "limit" then it is a preventive control)
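To make rules 4 and 5 concrete, here is a minimal Python sketch of the keyword check. It is only an illustration under some assumptions: the keyword lists are abbreviated, "similar to" is approximated by a case-insensitive match at the start of a word (so "review" also matches "reviews"), detection-type words such as "detect" or "monitor" are taken to indicate a detective control and prevention-type words such as "prevent" or "prior to" a preventive one, and the names count_hits and rule_label are hypothetical.

```python
import re

# Abbreviated keyword lists taken from the rules above; extend as needed.
AUTOMATED = ["automate", "automatically", "system performs", "configure",
             "produce", "generate", "rpa"]
MANUAL = ["set up", "review", "validate", "reconcile", "manual", "sod",
          "segregation of duty"]
DETECTIVE = ["compare", "monitor", "reconcile", "detect", "investigate",
             "exception", "discrepancies"]
PREVENTIVE = ["prevent", "prior to", "restrict", "approve", "authorize",
              "sign-off", "before"]


def count_hits(text, phrases):
    """Count the phrases that occur in text, matched at the start of a word."""
    lowered = text.lower()
    return sum(1 for p in phrases if re.search(r"\b" + re.escape(p), lowered))


def rule_label(text, first, second, first_name, second_name):
    """Return the label whose keyword list has more hits, or None on a tie."""
    a, b = count_hits(text, first), count_hits(text, second)
    if a == b:
        return None  # the rule abstains (assumption: ties carry no signal)
    return first_name if a > b else second_name


control = "The manager reviews and approves the invoice before payment."
how = rule_label(control, AUTOMATED, MANUAL, "automated", "manual")
kind = rule_label(control, PREVENTIVE, DETECTIVE, "preventive", "detective")
```

Counting hits on both sides and abstaining on a tie is one way to handle words such as "track" that appear in both lists; a stemmer or embedding similarity could replace the prefix match if closer "similar to" behaviour is needed.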
I already have code to extract the who, what, when, and where, as well as training data for the two classifiers.
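One possible way to combine the pieces is sketched below. The 60% classifier weight comes from the description above; how the remaining 40% is split is not specified, so this sketch assumes it is shared equally among the rules that produce a vote. The function name combine and its parameters are hypothetical.

```python
def combine(classifier_prob, rule_votes, classifier_weight=0.6):
    """Blend a classifier probability with rule-based votes.

    classifier_prob: probability of the positive class (e.g. "automated"), in [0, 1].
    rule_votes: one entry per rule that fired, 1.0 for the positive class and
                0.0 for the negative class; rules that abstain are left out.
    """
    if not rule_votes:
        # Assumption: with no rule evidence, fall back to the classifier alone.
        return classifier_prob
    rule_score = sum(rule_votes) / len(rule_votes)
    return classifier_weight * classifier_prob + (1 - classifier_weight) * rule_score


# Classifier says 0.7 "automated"; the "who" and "what" rules both vote automated.
score = combine(0.7, [1.0, 1.0])
label = "automated" if score >= 0.5 else "manual"
```

The same function can be reused for the preventive/detective decision by treating "preventive" as the positive class.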