简介




Excel中存在多线程工具-通常在多个Excel实例中运行宏,或者将宏转换为可以独立运行的vbscript。但是,我经常遇到想要将多个任务委派给异步流程的项目,为此,创建Excel本身的多个实例实在是太过分了。

在遇到多个我需要的项目之后执行多个异步请求(例如,Internet抓取或放射性衰变模拟),我决定制作一个通用类,当给出异步过程时,该类可以并行执行和处理多个异步请求。例如,此Daisy Test构成一个多线程组,该组将html请求发送到B列中的所有url。这些Google搜索的第一个链接按照响应到达的顺序返回到C列。这会触发第二组(菊花链链接到第一个事件)向该URL发送Internet Explorer请求,这些请求将在D中返回。


评论;应当注意,这些请求是按顺序发送的(B1B2,...),但无序返回(首先是C2)。那是因为我的类允许线程并行运行(因此multithreading)。这些仍然仅在单个Excel线程中进行管理,但是请求是异步的,并且处于不同的进程中,因此可以有效地在其他线程中运行。

摘要

术语“线程”在这里将被宽松地使用,而不涉及实际的处理器。相反,当我说“线程”时,我指的是一个任务的处理程序(该处理程序与其他处理程序上的其他任务并行运行)

多线程设置包含一个主要的clsMultiThread父类,控制多线程集合的形状(即,在任何给定时间正在运行多少任务)以及几个clsThreadHandle类。

这些线程处理程序各自负责运行异步任务,并在每个任务完成时通知父多线程类。

内部,任务实际上由WorkerClass对象运行,每个线程一个工人。它们从其父线程接收输入参数,运行其各自的异步任务,并在完成后向其父clsThreadHandle引发一个事件。然后,线程句柄将这个事件以及任何可选的返回值传递回主clsMultiThread,该主clsMultiThread的工作是一旦完成就关闭线程,或提示线程运行其他任务。下图总结了命令链:



反馈我很关注


关于结构的一般反馈,事件处理,使用接口等。
正确退出(我不认为我现在正在这样做)
用户界面


解决问题的方法是适当且直观的(传递工作者类等)。
我是否缺少应该使用的某些功能



这也是我曾经做过的第一个项目,其主要目的是使我可以重用(以及我编写的最长,最复杂的代码)。因此,对于


编码风格的任何评论,我也将不胜感激。 br />
实现

主类Option Explicit ''' 'VBA class to run multiple asynchronous processes 'Interfaces directly with clsThreadHandle 'Requires references to: 'mscrolib.dll ''' 'THREAD GROUP SHAPE PROPERTIES Private threadGroup As New Collection 'holds all the treads Private maxThreads As Long 'maximum number of threads that can be open Private minThreads As Long '[minimum number of threads] Private iterableQueue As mscorlib.Queue 'this item holds all the items from iterator set in queue 'replaces iterableGroup, newtaskindex, taskidset Private iterableSize As Long 'number of items in iterable group or Private passesArguments As Boolean 'true if iterableGroup exists 'THREAD GROUP REFERENCES Private WithEvents threadEvents As clsHandleEvents 'Event object to raise events from each thread handle Private workerClass As IWorker 'THREAD GROUP SETTINGS Private autoQuitEnabled As Boolean 'dictates whether to quit on Complete event, should be false if daisychaining 'THREAD GROUP STATE PROPERTIES Private openThreadCount As Long 'number of threads/handles currently open Private openTaskCount As Long 'number of tasks running on those threads Private closedTaskCount As Long 'number of threads closed (failed and successful) Private successfulTaskCount As Long 'number of threads completed sucessfully Private newThreadIndex As Long 'Iterator over handles (next new handle) Private newTaskIndex As Long 'Iterator over open tasks (next thread to be started) Private taskIDset As Collection 'Dictionary mapping taskIDs to iterableGroup location "REPLACE THIS. MERGE COLLECTION JUMBLES" Private freeThreads As Collection 'holds all the free thread ids 'THREAD GROUP PERFORMANCE PROPERTIES Private startTime As Date 'Private endTime As Date 'THREAD GROUP EVENTS Public Event TaskComplete(returnVal As Variant, taskID As String, threadID As String) 'when a task is complete on a thread, maybe if failed Public Event ThreadOpened(threadCount As Long, threadID As String) 'when a thread is opened, pass the new number of threads Public Event ThreadClosed(threadCount As Long, threadID As String) 'when closed, pass closed thread ID Public Event Complete(timeTaken As Date) 'when everything is (nearly) finished Public Event Closed(timeTaken As Date) 'when entire group is closed Public Event Opened(startTime As Date) 'when entire group is closed 'PRIVATE TYPES/ENUMS Private Type Instruction 'instruction on what to do next, and any necessary arguments that can be passed threadID As String instructionBody As InstructionType End Type Private Enum InstructionType mltCloseThread mltOpenThread mltSetTask mltDoNothing mltQuit End Enum Private Sub Class_Initialize() 'Set defaults maxThreads = 5 minThreads = 1 newThreadIndex = 1 newTaskIndex = 1 autoQuitEnabled = True Set threadEvents = New clsHandleEvents Set taskIDset = New Collection Set freeThreads = New Collection startTime = Now RaiseEvent Opened(startTime) ''' 'Test space ''' End Sub Private Sub threadEvents_Closed(threadID As String) RaiseEvent ThreadClosed(openThreadCount, threadID) End Sub Private Sub threadEvents_Opened(threadID As String) RaiseEvent ThreadOpened(openThreadCount, threadID) End Sub Private Sub threadEvents_Complete(obj As clsThreadHandle, returnVal As Variant) 'called when thread becomes free 'DO NOT mark as free here RaiseEvent TaskComplete(returnVal, obj.Task, obj.Name) 'failed as boolean openTaskCount = openTaskCount - 1 closedTaskCount = closedTaskCount + 1 successfulTaskCount = successfulTaskCount + 1 'could be unsuccessful too though doInstructions obj.Name 'pass object name so it can be marked free ' If failed Then ' failedTaskCount = failedTaskCount + 1 ' Else ' successfulTaskCount = successfulTaskCount + 1 ' End If End Sub Public Sub Execute() 'check validity of user data, if valid, then execute task If iterableSize = 0 Then Err.Raise 5, Description:="You must set size argument to a non-zero value, or a non-empty iterable first" ElseIf workerClass Is Nothing Then Err.Raise 5, Description:="You must set the async class argument first" Else doInstructions End If End Sub Public Sub Quit() 'Remove any references that would prevent proper closing 'Default automatically called when openThreadCount = 0 RaiseEvent Complete(Now - startTime) Set threadEvents = Nothing End Sub Private Sub doInstructions(Optional freeThreadID As String, Optional loopcount As Long = 1) Dim instructionVal As Instruction 'mark thread free if applicable If freeThreadID <> vbNullString Then freeThread = freeThreadID 'find out what to do instructionVal = getInstruction() 'carry out instruction Select Case instructionVal.instructionBody Case InstructionType.mltCloseThread closeThread instructionVal.threadID Case InstructionType.mltOpenThread openThread Case InstructionType.mltSetTask Dim taskThread As clsThreadHandle Dim taskArguments As Variant Set taskThread = threadGroup(instructionVal.threadID) 'assign task to thread assignTaskID (taskThread.Name) 'get any arguments there may be 'mark thread as busy BusyThread = taskThread.Name 'iterate open tasks openTaskCount = openTaskCount + 1 'execute task If passesArguments Then 'pop appropriate item from queue Set taskArguments = iterableQueue.Dequeue taskThread.Execute taskArguments Else taskThread.Execute End If Case InstructionType.mltQuit 'quit then do nothing Me.Quit instructionVal.instructionBody = mltDoNothing Case InstructionType.mltDoNothing 'do nothing Case Else Err.Raise 5 'invalid argument End Select 'call self until no instruction If instructionVal.instructionBody <> mltDoNothing Then Debug.Assert loopcount < maxThreads * 3 + 5 'max loop should be open all threads then run all tasks + a little doInstructions loopcount:=loopcount + 1 'watch for infinite loop End If End Sub Private Function getInstruction() As Instruction 'function to determine what action to take next 'called until do nothing returned 'caller to doinstructions can specify a free thread in which case some parts skipped Dim results As Instruction 'variable to hold instruction and any arguments Me.printState 'Do we need to open or close threads? 'Threads free? (threads open > tasks open): If openThreadCount > openTaskCount Then 'Great we have a free thread, now use it or delete it (cos we have too many or no tasks remaining) If newTaskIndex > iterableSize Then 'already passed all tasks '[find] & close free thread results.instructionBody = mltCloseThread results.threadID = freeThread ElseIf openThreadCount <= maxThreads Then '[find] & use free thread (run a task on it) results.instructionBody = mltSetTask results.threadID = freeThread Else '[find] & close free thread results.instructionBody = mltCloseThread results.threadID = freeThread End If Else 'No threads free, either open one (if not exceeding max, and there's a task left to put on it) 'Or do nothing (can't close it if not free, shouldn't open new if no more tasks) If openThreadCount < maxThreads And newTaskIndex <= iterableSize Then results.instructionBody = mltOpenThread ElseIf openThreadCount = 0 And autoQuitEnabled Then results.instructionBody = mltQuit Else results.instructionBody = mltDoNothing End If End If getInstruction = results End Function Private Sub openThread() 'opens a thread and assigns a task ID to it Dim newThread As New clsThreadHandle 'create new handle newThread.OpenHandle Me, threadEvents 'passes parent reference which allows handle to obtain thread ID threadGroup.Add newThread, newThread.Name 'add it to the group with a new id (set by itself) openThreadCount = openThreadCount + 1 freeThread = newThread.Name 'mark as free so task can be assigned to it End Sub Private Property Let freeThread(threadID As String) 'NOT WORKING""""" 'when a thread comes free, add it to the collection freeThreads.Add threadID, threadID Debug.Print threadID; " marked as free; now"; freeThreads.Count; "threads are free" End Property Private Property Let BusyThread(threadID As String) 'when a thread is not free or is closed, mark as busy by removing from free group On Error Resume Next 'only remove ones what are there actually freeThreads.Remove threadID Debug.Print threadID; " marked as busy"; IIf(Err.Number <> 0, ", but wasn't in free group", vbNullString) End Property Private Property Get freeThread() As String 'gives up a free thread and adds it to the list freeThread = freeThreads(1) freeThreads.Remove (1) End Property Private Sub assignTaskID(threadID As String) '@Ignore WriteOnlyProperty 'assigns task ID to thread 'nb does NOT actually run the task (this is instruction stage still) Dim newThread As clsThreadHandle Set newThread = threadGroup(threadID) newThread.Task = NewTaskID Set newThread.Worker = AsyncClass End Sub Private Sub closeThread(threadID As String, Optional failed As Boolean = False) 'close thread with appropriate id Dim oldThread As clsThreadHandle Set oldThread = threadGroup(threadID) 'remove from all collections 'taskIDset.Remove oldThread.Task remove from task id set if it was in there threadGroup.Remove oldThread.Name BusyThread = oldThread.Name 'remove from free collection Set oldThread = Nothing 'iterate counters openThreadCount = openThreadCount - 1 End Sub Public Property Let Size(sizeFactor As Variant) 'property of the thread group which dictates how many processes to run in total 'size factor is either an iterable item, or an integer to dictate the size 'Check if size factor is number If IsNumeric(sizeFactor) Then 'If so, size is that iterableSize = CLng(sizeFactor) passesArguments = False 'no argument to pass to thread, just run it a load of times 'If not, *check if iterable ElseIf isIterable(sizeFactor) Then 'If so, size is size of collection from extration Set iterableQueue = New Queue iterableSize = addIterableToQueue(sizeFactor, iterableQueue) passesArguments = True Else '[if not, raise error] Err.Raise 5 'invalid argument End If End Property Public Sub IncreaseSize(sizeFactor As Variant) 'method of threadGroup which adds more tasks to the queue, and immediately runs them 'size factor is either an iterable item, or an integer to dictate the size 'Check whether size is set yet If Me.Size = 0 Then Err.Raise 5, Description:="You must set Size before you can IncreaseSize" End If 'check whether new data matches old type If IsNumeric(sizeFactor) Then If passesArguments Then Err.Raise 5, Description:="Size factor type doesn't match original type" Else 'is numeric and was numeric, grand iterableSize = iterableSize + CLng(sizeFactor) End If ElseIf isIterable(sizeFactor) Then If passesArguments Then 'was iterable and still is, great! Dim itemsAdded As Long itemsAdded = addIterableToQueue(sizeFactor, iterableQueue) iterableSize = iterableSize + itemsAdded Else 'wasn't iterble, now is Err.Raise 5, Description:="Size factor type doesn't match original type" End If Else '[if not, raise error] Err.Raise 5 'invalid argument End If Me.Execute End Sub Public Property Set AsyncClass(ByVal workObj As IWorker) 'Set the worker who carries out the tasks Set workerClass = workObj End Property Public Property Get AsyncClass() As IWorker Set AsyncClass = workerClass End Property Public Property Get Size() As Variant Size = iterableSize End Property Public Property Let autoQuit(ByVal value As Boolean) autoQuitEnabled = value End Property Public Property Get NewHandleID() As String NewHandleID = "Handle " & newThreadIndex newThreadIndex = newThreadIndex + 1 'use next one next time End Property Private Property Get NewTaskID() As String 'generates new task, saves its ID to taskIDset, then bumps the task counter along one NewTaskID = "Task " & newTaskIndex taskIDset.Add newTaskIndex, NewTaskID 'add id to map newTaskIndex = newTaskIndex + 1 End Property Private Sub Class_Terminate() 'Set threadGroup = Nothing Debug.Print "Terminating group" RaiseEvent Closed(Now - startTime) End Sub Public Sub printState() 'for debugging Debug.Print _ "State:"; vbCrLf _ ; Space(5); "Threads open: "; openThreadCount; vbCrLf _ ; Space(5); "Threads in use: "; openTaskCount; vbCrLf _ ; Space(5); "Threads marked as free: "; freeThreads.Count; vbCrLf _ ; Space(5); "Tasks remaining: "; iterableSize - successfulTaskCount; vbCrLf _ ; Space(5); "Next task index: "; newTaskIndex End Sub


对,有些代码。这是处理所有子类的主类。

 doInstruction 


其主要方法是getInstruction (调用Size)以及IncreaseSizedoInstruction

该类迭代运行;在每个循环中,类找出要执行的操作并执行(clsThreadHandle)。除非被告知不执行任何操作,否则doInstruction始终会自行调用自身,这会使调用堆栈缩小。每个循环有几个选项


打开一个线程(创建一个新的getInstruction实例并添加到可能的位置来运行任务)
关闭一个线程(退出句柄并将其从该集合中删除)
在线程上运行任务
[强制退出任务-tb已实现]
不执行任何操作(允许调用堆栈返回零)

Size方法将告诉该类


如果出现以下情况,则打开线程它没有超过最大计数,并且如果有要运行的任务
如果没有剩余的任务要运行或者如果有太多的任务请关闭线程
有一个标记为空闲的线程
如果没有可用的线程,则不执行任何操作,并且有正确数量的打开的线程

Size决定要执行的任务数


如果Size是数字,则该类将在线程上运行正在运行的任务,直到该数量的任务被运行为止。
如果For...Each是可迭代的,则该类将保持运行的任务并通过IncreaseSize传递参数通过可迭代的参数


,这允许将诸如url之类的内容作为参数传递给每个任务,甚至传递给一个范围,以使工作人员知道工作表上的何处写入其结果到



Size就像threadComplete一样;如果要将馈送任务滴加到多线程集中,这很有用(例如,您正在使用第一个clsThreadHandle事件以菊花链方式将其链接到另一个线程)。


线程句柄Option Explicit 'THREAD HANDLE BASE PROPERTIES Private eventHandle As clsHandleEvents 'Events module multithread set which handle belongs to. Called when handle state changes Private taskID As String 'holds the id of the current task Private handleID As String 'holds the id of this handle Private handleArgs As Variant 'holds any arguments that need to be passed to the task 'THREAD EVENTS Private WithEvents workerEvents As IWorkerEvents Private workerObject As IWorker 'interface to whatever worker may be passed to thread Private Sub workerEvents_Complete(returnVal As Variant) eventHandle.NotifyComplete Me, returnVal End Sub Private Sub workerEvents_Started() Debug.Print Me.Task; " started event was raised" End Sub Public Property Set Worker(ByVal workObj As IWorker) Set workerObject = workObj.CreateNew 'set worker to be a copy of the passed one Set workerEvents = New IWorkerEvents 'create event handler Set workerObject.Events = workerEvents 'pass it to the worker so it can listen in End Property Public Sub OpenHandle(multiThreadGroup As clsMultiThread, delegate As clsHandleEvents) 'called when the handle is opened, sets the reference IDs of the string and the handle, as well as parent g Set eventHandle = delegate handleID = multiThreadGroup.NewHandleID eventHandle.NotifyThreadOpened (Name) Debug.Print Name; " was opened" End Sub Public Sub Execute(Optional args As Variant) Debug.Print Task; " executed on "; Name; " with "; IIf(IsMissing(args), "no arguments", "some arguments") workerObject.Execute args 'run the event End Sub Public Property Get Task() As String Task = taskID End Property Public Property Let Task(val As String) taskID = val Debug.Print Name; "'s task was set to "; taskID End Property Public Property Get Name() As String Name = handleID End Property Private Sub Class_Initialize() Debug.Print "I'm made" End Sub Private Sub Class_Terminate() eventHandle.NotifyThreadClosed (Me.Name) Set eventHandle = Nothing Set workerObject = Nothing End Sub Private Sub workerEvents_StatusChange(statusVal As Variant) 'not yet implemented, probably unnecessary End Sub


主类创建此线程句柄类的多个实例。

 clsHandleEvents 


我选择了单个事件处理程序,而不是单个事件处理程序(就像我对clsHandleEvents所做的那样),因为


我发现为每个任务/工作对象有一个单独的线程类更容易从心理上描绘
我打算添加一个功能,使工作人员可以在其父句柄中缓存对象(例如InternetExplorer应用程序),以保存在同一线程的连续任务之间重新初始化它的操作>每个线程只有一个缓存使此操作更简单



处理事件类Option Explicit 'class to convert calls from the thread handle into events which the multi thread group can tap into Public Event Complete(obj As clsThreadHandle, returnVal As Variant) Public Event Opened(threadID As String) 'when thread is actually opened Public Event Closed(threadID As String) 'when thread is closed Public Sub NotifyComplete(obj As clsThreadHandle, Optional returnVal As Variant) RaiseEvent Complete(obj, returnVal) End Sub Public Sub NotifyThreadOpened(threadID As String) RaiseEvent Opened(threadID) End Sub Public Sub NotifyThreadClosed(threadID As String) RaiseEvent Closed(threadID) End Sub Private Sub Class_Terminate() Debug.Print "Events Terminated" End Sub


保留了对该类的引用每个线程,以便它可以引发事件到multiThread类,而无需直接持有对其的引用(我认为这会弄乱垃圾回收)

 interface 


接口

有2个IWorker类(只有IWorkerEvents实际上是一个类,但我也称IWorker一个类,与此类似示例)

IWorkerEvents构成了您可以运行的异步进程的通用模板,该模板会根据IWorker引发适当的事件

Option Explicit 'class acts as interface for any thread task 'Execute runs the task 'Events are raised by the task if it interfaces properly Public Property Set Events(ByRef value As IWorkerEvents) End Property Public Sub Execute(Optional argument As Variant) End Sub Public Function CreateNew() As IWorker End Function

 IWorkerEvents 


Option Explicit 'class holds all the events that a thread task can raise Public Event Complete(returnVal As Variant) Public Event StatusChange(statusVal As Variant) Public Event Started() Public Sub Complete(Optional returnVal As Variant) RaiseEvent Complete(returnVal) End Sub Public Sub StatusChange(statusVal As Variant) RaiseEvent StatusChange(statusVal) End Sub Public Sub Started() RaiseEvent Started End Sub

 clsMultiThread 


最后...

我不需要特别检查一些补充功能模块,但是我将包括它们,因为它们是Option Explicit Public Function addIterableToQueue(iterator As Variant, ByRef resultQueue As Queue) As Long 'function to take iterable group and add it to the queue 'returns the number of items added Dim item As Variant Dim itemsAdded As Long itemsAdded = 0 For Each item In iterator resultQueue.enqueue item itemsAdded = itemsAdded + 1 Next item addIterableToQueue = itemsAdded End Function Function isIterable(obj As Variant) As Boolean On Error Resume Next Dim iterator As Variant For Each iterator In obj Exit For Next isIterable = Err.Number = 0 End Function 执行所必需的

 worker 



测试代码

不需要此反馈东西,除了实现MSHTML的方式之外,在此处下载示例文件

在我看来,我实际上并没有包括一名工人对此进行测试。好吧,这是一个使用String请求从网页返回HTML文档的示例。它接受代表网址的Range / HTMLDocument参数,并返回imported。注意,必须为Attribute .VB_UserMemId = 0,因为根据本文的要求它为VERSION 1.0 CLASS BEGIN MultiUse = -1 'True END Attribute VB_Name = "clsHtmlWorker" Attribute VB_GlobalNameSpace = False Attribute VB_Creatable = False Attribute VB_PredeclaredId = False Attribute VB_Exposed = False Option Explicit ''' 'Basic worker object sends MSHTML GET request to webpage and returns an HTMLDocument or Nothing 'Requires reference to ' Microsoft HTML Object library (mshtml.tlb) ' Microsoft XML, v6.0 (msxml6.dll) ''' Private httpRequest As MSXML2.XMLHTTP60 Implements IWorker Private Type TWorker Events As IWorkerEvents End Type Private this As TWorker Private Function IWorker_CreateNew() As IWorker Set IWorker_CreateNew = New clsHtmlWorker End Function Private Property Set IWorker_Events(RHS As IWorkerEvents) Set this.Events = RHS End Property Private Sub IWorker_Execute(Optional argument As Variant) Started 'raise event to thread handle 'Do some task sendRequest argument End Sub ''' 'Event raising ''' Private Sub Started() If Not this.Events Is Nothing Then this.Events.Started End If End Sub Private Sub statusChange(ByVal statusText As String) If Not this.Events Is Nothing Then 'status change is not fully implemented yet in clsMultiThread, I may get rid of it this.Events.statusChange statusText End If End Sub Private Sub Complete(Optional ByVal resultPage As HTMLDocument) If Not httpRequest Is Nothing Then Set httpRequest = Nothing If Not this.Events Is Nothing Then this.Events.Complete resultPage End If End Sub Private Sub sendRequest(ByVal url As String) ''' 'Sub to open a new XMLHTTP request at a given url 'Also assigns OnReadyStateChange callback function to this class' default routine ''' If httpRequest Is Nothing Then Set httpRequest = New MSXML2.XMLHTTP60 With httpRequest 'Assign callback function to handler class (by default property) .OnReadyStateChange = Me 'open and send the request .Open "GET", url, True .send vbNullString End With End Sub Public Sub OnReadyStateChange() Attribute OnReadyStateChange.VB_UserMemId = 0 ''' 'This is the default callback routine of the class ''' With httpRequest statusChange .statusText If .ReadyState = 4 Then 'loaded If .Status = 200 Then 'successful 'mark complete and pass document Dim htmlDoc As HTMLDocument Set htmlDoc = New HTMLDocument htmlDoc.body.innerHTML = .responseText Complete htmlDoc Else 'unsuccessful Complete End If End If End With End Sub Private Sub Class_Terminate() If Not httpRequest Is Nothing Then Set httpRequest = Nothing End Sub

实现它的多线程组可以在类似codeReviewTest的调用程序类中运行。将请求发送到A1:A10中的url,从相邻列中的那些URL返回电子邮件。

 Option Explicit

'''
'This class creates and runs a new multithread instance which runs clsHtmlWorker
'When each HTMLDocument is complete, the class scans it for e-mails
'''
Private WithEvents multiThreadGroup As clsMultiThread
'clsMultiThread is async so must be  declared separately (or in a doEvents loop)
Private Const REGEX_PATTERN As String = _
"(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|""(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\[\x01-\x09\x0b\x0c\x0e-\x7f])*"")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4])|1[0-9]|[0-9]?[0-9]))\.){3}(?:(2(5[1-9]|[0-9])|1[0-5]|[0-4]?[0-9])|[0-9]*[0-9]:(?:[1-9]|\[0-9])+)\])"

Public Sub run()
    'urls to check for emails are in a1:a10
    htmlRequestToUrls [a1:a10]
End Sub

Private Sub htmlRequestToUrls(urlCells As Range)

    Set multiThreadGroup = New clsMultiThread
    With multiThreadGroup
        .Size = urlCells                         'set iterable, here a load of urls
        Set .AsyncClass = New clsHtmlWorker      'set async worker
        .Execute                                 'run the group
    End With

End Sub

Private Sub multiThreadGroup_TaskComplete(returnVal As Variant, taskID As String, threadID As String)

    Dim rowI As Long, colI As Long
    rowI = Right(taskID, Len(taskID) - 4)

    If returnVal Is Nothing Then
        Cells(rowI, 2) = "Error in loading page"
    ElseIf TypeOf returnVal Is HTMLDocument Then
        Dim emailMatches() As String
        emailMatches = regexMatches(returnVal.body.innerText)
        If (Not emailMatches) = -1 Then
        'no emails on page
            Cells(rowI, 2) = "No e-mail matches"
        Else
            For colI = LBound(emailMatches) To UBound(emailMatches)
                Cells(rowI, colI + 2) = emailMatches(colI)
            Next colI
        End If
    Else                                         'nothing returned
        Cells(rowI, 2) = "Error in loading page"
    End If


End Sub

Private Function regexMatches(strInput As String) As String()

    Dim rMatch As Object
    Dim s As String
    Dim arrayMatches() As String
    Dim i As Long

    With CreateObject("VBScript.Regexp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = REGEX_PATTERN
        If .test(strInput) Then
            For Each rMatch In .Execute(strInput)
                ReDim Preserve arrayMatches(i)
                arrayMatches(i) = rMatch.value
                i = i + 1
            Next
        End If
    End With

    regexMatches = arrayMatches

End Function
 


测试类将创建一个新的多线程组。该组将打开默认的5个线程,它将在每个线程上创建clsHtmlWorker的实例。它将把范围[A1:A10]转换为10个参数,每次不忙时将一次传递给每个线程上的工人。一旦运行了所有任务,该类将autoQuit-剪切对所有子类的引用,使其超出范围。

如果需要,您可以下载示例工作簿,最好与Rubberduck一起组织文件夹。测试代码在CodeReviewTestRunner中,或者只需按一下大按钮

评论

我想知道为什么我们有评论说需要引用mscorlib.dll,但我看不到从该库创建内容的CreateObject甚至`GetObject调用?这都是纯VBA,并且因为VBA是单线程的,所以所有内容都将作为一个线程运行?

@This 1)mscorlib.dll我正在使用早期绑定,并且使用私有iterableQueue作为clsMultiThread中的mscorlib.Queue是必需的。 2)是的,要完全模拟Excel中的多线程,您需要创建多个EXCEL.EXE实例。但是,该项目专门针对异步过程,因为它们不会直接在Excel中运行。确保处理都是单线程的,但是在Internet应用程序中,主要的开销是在等待响应负载。这可以异步完成,并且可以并行处理多个实例。我希望这是有道理的

我认为这是否可以非常异步地运行代码,取决于是否存在一个支持产生其执行直到某些异步工作完成的工作程序类。没有这样的工作者,任务将在执行时简单地同步完成。

@ M.Doerner恰恰是,可以分别在Excel之外使用事件和默认属性回调函数hack异步运行InternetExplorer.Application或XmlHttprequest(请参阅此文章)文章。我使用IWorker接口的目的是确保工作人员在完成时引发事件,这提醒我们,这些事件应该是异步工作人员,而不是常规例程。多线程标准例程需要这种方法

@Raystafarian如果有帮助,我已在测试区域中添加了下载文件

#1 楼

有趣的想法,做得好!

命名

我真的不喜欢这个名字。像clsMultiThread这样的名称在某种程度上具有误导性,因为如您所述,它们实际上并未提供任何真正的多线程。一个粗心的用户会期望它可以处理任何事情,并且当他们所有排队的工作痛苦地同步完成时,他们会感到失望。 ;)

此外,我们并不是真正在使用线程,而是可能在进程内运行或可能不在进程内运行的对象。您使用了MSXML2.XMLHTTP60,因此它应该在进程中运行。但是,如果我们使用类似ShDocVw.WebBrowser甚至Excel.Application之类的东西可能会导致进程外运行,则不一定如此。那不是线程。因此,在那种情况下,我们实际上是在谈论异步运行,而不是线程运行。关键是它应该传达这样的观念,即那些对象与异步运行无关。我们只是在安排异步任务以并行运行。

您在评论中询问了匈牙利符号。我个人的喜好是不要对对象使用匈牙利表示法,因此我不会使用ParallelTaskCoordinator前缀。是否添加前缀不会帮助解决有关名称空间的问题,这主要是因为良好的名称空间与逻辑分组有关,而不是根据模块的类型将它们组合在一起。我可以将HN用于私有变量,而不是使用面向公共的属性(例如模块的名称,公共属性),因为它们只是分散了它们的语义含义。名称的语义部分更为重要,良好的命名约定应支持该名称。

通常,您的命名方案似乎经过深思熟虑。我确实看到的不一致之处是,在TaskHandle方法中,您将AsyncObjectWatcher作为参数名称,然后将其分配给cls。为什么不以相同的方式调用参数,所以在编写Execute方法时应该清楚它应该是什么?

不是完全异步的,可能被主线程阻止了。

它可以独立工作。但是,我确实想提醒您注意以下事实:它在更复杂的场景中不起作用。这是一个中断的示例:

在立即窗口中执行:

任务进度。在实践中,我始终要完成5个任务,然后输出100个delegate,然后再完成其他5个任务。这告诉我,如果在其他线程中的任务正在运行时允许主线程执行操作,则这些任务将被阻塞,直到主线程变为空闲状态为止。因此,如果不小心,很容易破坏异步。

如您所见,即使撒上eventHandle显然也不足够。考虑一下,这并不奇怪,因为“事件”来自必须在UI线程中运行的同一VBA代码。因此,如果这成为问题,则必须将异步完全委派给外部进程/线程(例如,通过外部库),在该进程/线程中您可以通过事件将进度传达给VBA代码,而不是依赖于外部对象将事件引发为工作者,必须向线程句柄然后向管理器引发事件。

请注意,即使走了这条路线,您也需要具有重试逻辑才能成功引发事件,因为运行VBA代码可能会阻止进入。有机会错过一个事件,尤其是在活动率很高且投入过多的情况下。在这里,这里,这里之前已经观察到了这一点。这些示例说明了某些情况并非总是以我们期望的方式触发的情况。

如注释中所建议,更改最大线程数可能会更好。例如,将最大线程数设置为2,我们看到任务1和任务2在阻塞之前完成了90次打印Execute,然后允许任务3开始然后完成,然后再阻塞10次以上,并且休息完成。但是,这在我的系统上非常一致。在运行测试的3次中,任务3直到打印90才开始,其余的直到100打印才开始。另一方面,如果将maxthreads设置为1,则会完成一个任务,然后在允许其余任务运行之前被阻塞100次。我不希望它容易复制,因为这会受到许多因素(硬件,Windows版本,Excel版本)的影响。这只是要注意的事情,并针对这种可能性进行明确设计。

如果对您来说至关重要的是,不要被阻止,则需要考虑采用其他方法。例如,您需要一个外部.NET库来创建线程,运行任务,然后将输出写入文件。这使主线程可以随意读取它,并且当主线程需要执行某些操作时,不会自动阻塞任何生成的线程。即便如此,它仍然受到以下事实的影响:尝试生成新线程时,它可能会被阻止(因为您需要VBA来运行代码来创建一个新线程,即使它只是在DLL中调用外部函数也是如此)。 >
注意:在我所有的测试中,仅打印的Debug.Print iiDoEventsi。我已经把Debug.Print注释掉了;否则,立即窗口将溢出,并且从头到尾都看不到所有输出。

WinHttp而不是MSXML

此外,我想引起您的注意的是,您可能拥有一个Task N started实例,该实例本身就支持事件。因此,您可以声明类似Task N completed的变量,而不是监听事件。这意味着您不需要像使用Events terminated那样设置默认成员,并且如果您只是执行Internet请求,那么您甚至不需要线程集合。只需收集PrintState并收听他们的事件。

但这显然不是通用的解决方案,并且在需要工人处理设置的情况下,使用WinHttp.WinHttpRequest并不妨碍我们将其与上述解决方案一起使用。

如果不需要,请不要递归

您有一个Private WithEvents request As WinHttp.WinHttpRequest,它可以递归调用自己。但是IMPOV,没有理由进行递归。您可以通过一个简单的循环来完成相同的操作,就像我的hacky更改所展示的那样。一个更合适的解决方案可能是在循环的底部使用多个条件或一个标志变量(确保它至少执行一次)。这样就可以确保您不必担心堆栈溢出。 :

call runtest: for i = 0 to 100 : debug.Print i: doevents: next


我的印象是管理器的目的是允许我们为每个异步运行的工人创建线程。因此,我只能使用MSXML来设置WinHttp.WinHttpRequest的单个实现,这似乎很奇怪。我是否应该能够分配一群工人,然后征集他们自己的论点?我认为这将是更直观地使用管理器。像这样的东西:

Private Sub doInstructions(Optional freeThreadID As String, Optional loopcount As Long = 1)
    Dim instructionVal As Instruction

Do
    'mark thread free if applicable
    If freeThreadID <> vbNullString Then freeThread = freeThreadID

    'find out what to do
    instructionVal = getInstruction()
    'carry out instruction
    Select Case instructionVal.instructionBody
    Case InstructionType.mltCloseThread
        closeThread instructionVal.threadID
    Case InstructionType.mltOpenThread
        openThread
    Case InstructionType.mltSetTask
        Dim taskThread As clsThreadHandle
        Dim taskArguments As Variant
        Set taskThread = threadGroup(instructionVal.threadID)
        'assign task to thread
        assignTaskID (taskThread.Name)
        'get any arguments there may be
        'mark thread as busy

        BusyThread = taskThread.Name
        'iterate open tasks
        openTaskCount = openTaskCount + 1
        'execute task
        If passesArguments Then
            'pop appropriate item from queue
            Set taskArguments = iterableQueue.Dequeue
            taskThread.Execute taskArguments
        Else
            taskThread.Execute
        End If

    Case InstructionType.mltQuit
        'quit then do nothing
        Me.Quit
        instructionVal.instructionBody = mltDoNothing
    Case InstructionType.mltDoNothing
        'do nothing
    Case Else
        Err.Raise 5                              'invalid argument
    End Select

    'call self until no instruction
    If instructionVal.instructionBody <> mltDoNothing Then
Debug.Assert loopcount < maxThreads * 3 + 5      'max loop should be open all threads then run all tasks + a little
        'doInstructions loopcount:=loopcount + 1  'watch for infinite loop
        freeThreadID = vbNullString
        loopcount = loopcount + 1
    Else
        Exit Do
    End If
Loop

End Sub


由此可见,通过传入WinHttp.WinHttpRequest工厂,显然可以看到要设置要执行的任务。这不会将我们限制为仅一个特定的工作者类,它有助于查看我们为每个任务发送的参数。

将您的私有字段封装为一种类型建议您从@MathieuGuindon的书中取出一页,并使用他的方法:区分类的私有后备字段和公开的属性。这是帮助错误代码看起来错误的一种方法。

Initialize事件中的RaiseEvents

在我看来这可能很麻烦: >
对于doInstructionsAsyncClass事件内部的任何工作,我个人的规则是使代码与类完全隔离。通常,不适合在课堂外伸出手,因为它尚未完成自身的构建,因此,结果可能无法预测。在这种情况下,效果可能是良性的,因为我们仅引发带有报告参数的事件。但是,在更复杂的实现中,这可能会以意外的方式起作用。此外,它实际上并没有买任何东西,因为如果调用IWorker,我们就已经知道了,因为通常我们首先必须对其进行初始化。

您真的需要一个队列吗?

使用.NET队列来帮助设置参数很酷。但是,这意味着您有一个额外的参考,并且您已经有许多参考,这使得很难将代码分发到不同的环境。

一种方法是延迟绑定队列,方法是将其声明为对象并执行IWorker。这样一来,您就无需添加对.NET核心库的显式引用,从而无需了解.NET Framework版本,因为在这种情况下,该类在版本之间不会更改。另一种选择是只使用内置的VBA集合。 IINM,您仅使用队列来收集参数,VBA选择也可以做到这一点。这会给您类似队列的行为:

Set .AsyncClass = New clsHtmlWorker


,没有任何外部引用。 >
@Raystafarian提到,实现接口时,您不应该接受默认的this.命名。我个人讨厌,并且总是更改名称。接口的实现并不关心您使用的名称。它唯一关心的是有一个名为something的过程(检查),它具有N个参数(检查),并且每个参数都具有相同的数据类型(检查)。它甚至不查看参数的名称。因此,您应该将它们更改为更明智的设置。如果您不敢想象,请给他们打电话Initialize,这是我通常要做的。只是不要将其保留为RHS。

评论


\ $ \ begingroup \ $
这里有很多有见地的技巧;递归,默认命名,排队-有时您需要有人从不同的POV查看您的代码以发现这些问题,谢谢。我最近也一直在尝试私有字段类型,(实际上也是在偶然情况下这样做的!)。但是有几个问题:命名-您说您不喜欢这些名称,然后继续强调一些误导性的名称,但是命名是否还有其他问题(RHS除外)?我主要是纯粹以匈牙利语风格加上cls前缀,因此我所有的东西都可以轻松理解(因为没有名称空间)
\ $ \ endgroup \ $
– Gredo
18-3-29在9:08



\ $ \ begingroup \ $
可能还会被您运行的主线程测试阻止。当您描述正在发生的事情时,我最初很担心,但是我发现课程实际上正在按预期进行。前五个异步运行,但全部在同一过程中(就我使用MSXML2.XMLHTTP60而言,这是一个令人尴尬的疏忽)-它们完全占用了UI线程,因此无需打印。它们引发完整的事件,然后进入For循环,并输出0-然后从DoEvents输出接下来的5个事件,然后退出类,然后线程可以自由打印其余的数字。因此,如给定的1个线程?
\ $ \ endgroup \ $
– Gredo
18年3月29日在9:16

\ $ \ begingroup \ $
尝试注释掉PrintState的内容,并将maxThreads更改为2或类似的内容-它揭示了发生的事情,尽管我仍然不确定自己是否在想,是否想再看看吗?
\ $ \ endgroup \ $
– Gredo
18年3月29日在9:20

\ $ \ begingroup \ $
@Greedo我已经回答了您的问题,并相应地编辑了我的答案。
\ $ \ endgroup \ $
–这
18-3-29在10:49



\ $ \ begingroup \ $
这是一个很好的答案
\ $ \ endgroup \ $
–雷斯塔法里安
18 Mar 30 '18 at 11:36

#2 楼

这超出了我的专业知识,但也许添加答案会导致更多的浏览/答案?另外,那个初学者标签在做什么? ;)
首先,我要说的真的很扎实。这可能就是为什么这里没有太多活动的原因。

ByRef或ByVal参数
您可能知道的每个参数中,未声明ByVal是隐式地ByRef。我想您可能需要其中许多作为ByRef,但最好将它们明确声明为ByRef,这样更容易分辨它应该是ByRef。给定它们都是相同的类型,三个可能会使用一个新名称。也许是threadObjectthreadHandle-由您决定。
第四个是采用变体并返回布尔值,它的名称可以是testObjectiteratorGroup。您可能想重命名其中的一些
队列我怎么知道queue是什么以及使用什么方法?它不是VBA的标准参考库,是吗?我可能会在其中添加评论,解释您为什么选择以这种方式进行操作,因此没有人需要回顾所有内容以弄清楚为什么它比其他方法更好。
RHS?
clsHtmlWorker
clsHandleEvents.Complete(obj as clsThreadHandle)
clsHandleEvents.NotifyComplete(obj as clsThreadHandle)
clsMultiThread.threadEvents_Complete(obj as clsThreadHandle)
multiThreadMethods.isIterable(obj as Variant)

什么是RHS?它是一个常数吗?我可以告诉您,在您的项目中使用Html而不是HTML更好-那是什么呢?

所有这些,我什至无法真正弄清楚它的工作原理,我看到了应该将请求发送到哪里(httpRequest),但是我无法弄清楚它们是如何返回并填充工作表的,我看不到发生了什么,我认为这是异步的吗?

评论


\ $ \ begingroup \ $
非常感谢您-我在这里没有考虑过一些优点。我放置了Beginner标签,因为这是我认为值得回顾的可重用代码的第一部分,很高兴您认为它很可靠!这些RHS是直接从我引用的示例之一中获取的,但我同意,它们的味道很差。您对调试异步内容完全正确,这就是为什么我有那么多debug.print的部分原因,因为您无法单步执行操作代码。排队的好点。手指交叉我可以在代码的其他一些部分上得到一些更好的答案
\ $ \ endgroup \ $
– Gredo
18-2-15在19:39



\ $ \ begingroup \ $
我喜欢debug = true的一部分,因此您可以在立即窗口中看到正在发生的事情,并记录所有更新。有人会说这是浪费了一些精力,但是调试它无能为力。
\ $ \ endgroup \ $
–雷斯塔法里安
18年2月15日在21:28

\ $ \ begingroup \ $
RHS =右侧?
\ $ \ endgroup \ $
–所罗门·乌科(Solomon Ucko)
18年3月24日在1:56